At any given time, NASA has more than a hundred satellites and devices on Earth and throughout the solar system collecting data. For NASA’s Jet Propulsion Laboratory (JPL), managing, analyzing, and storing this data is a round-the-clock job.
The data that JPL works with is critical to understanding how the planets and the universe function. “We’re here to provide scientifically valuable research for the world,” says Rob Witoff, an IT data scientist for JPL. “We collect data for the big questions we’re trying to answer.”
Because the data NASA gathers is indispensable, handling it poses its own set of hardware and software challenges. One such challenge is recognizing which data is important and which is useless.
JPL constantly receives data from unmanned rovers and landers, space- and Earth-based telescopes, and myriad satellites and probes. With so much data being captured, NASA’s scientists cannot go through it all themselves, especially if that data is quickly erased to make room for more. According to Kiri Wagstaff, who works in the machine learning systems sector of JPL, the solution is to automate the process by developing data-scanning algorithms.