Revolutionizing Database Technology: GSFRS
Written by Gitika Gorthi, Chantilly High School
Has there ever been a day when you were free from all electronic devices? Wouldn’t the day feel dry without some form of digital juice? Much of the data we deal with these days is classified as big data: collections of data that are larger and more diverse than what traditional databases hold. Have you ever had a moment when a large program took a long time to load or open a particular file, and the computer seemed stuck on that one page? If so, you are not the only one. Whether for students, researchers, or even technologies such as rovers, fast data processing and analytics are always desirable. What if we told you that there is a novel, portable, and highly efficient data access tool that allows near-real-time access to any part of a large data file? You might be thinking, is that possible? The answer is yes. Giant Signal File Random Samples (GSFRS), software developed for use by other software, is a solution to the slow processing we face today. Unlike present-day practice, GSFRS loads small parts of the data without loading the entire file into computer memory, through two fundamental steps: indexing and parsing.
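To make the two steps concrete, here is a minimal Python sketch of the general idea. It is not GSFRS's actual code: it assumes the data is a plain file with one record per line, and the names `make_sample_file`, `build_index`, and `read_record` are made up for this illustration. Indexing scans the file once and remembers where each record starts; parsing then jumps straight to one record and decodes only that record.

```python
import os
import tempfile


def make_sample_file(n_records):
    """Demo helper (hypothetical): write n_records lines and return the file path."""
    with tempfile.NamedTemporaryFile("wb", suffix=".txt", delete=False) as tmp:
        for i in range(n_records):
            tmp.write(f"sample {i}\n".encode("utf-8"))
        return tmp.name


def build_index(path):
    """Indexing: scan the file once and record the byte offset where each line starts."""
    offsets = []
    position = 0
    with open(path, "rb") as f:
        for line in f:  # reads line by line instead of loading the whole file
            offsets.append(position)
            position += len(line)
    return offsets


def read_record(path, offsets, n):
    """Parsing: seek directly to record n and decode only that one line."""
    with open(path, "rb") as f:
        f.seek(offsets[n])
        return f.readline().decode("utf-8").rstrip("\n")


if __name__ == "__main__":
    path = make_sample_file(1000)
    try:
        offsets = build_index(path)
        print(read_record(path, offsets, 742))
    finally:
        os.remove(path)
```

The index costs one pass over the file, but after that, reaching any record is a single seek, no matter how large the file is.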
You might wonder how GSFRS helps the technology we have now. How can it be applied in the real world? How does it impact our planet? Lately, the landing of the National Aeronautics and Space Administration (NASA) Perseverance rover on Mars has made history in the news, and many new discoveries are hoped for on the red planet. But how is a rover powered? Earlier rovers such as Spirit and Opportunity drew their main power from multi-panel solar arrays, while Perseverance carries a radioisotope power system; in either case, two rechargeable batteries cover the times when the main source is not enough, such as at night for a solar-powered rover, and these batteries eventually degrade. Faster input and output of data, with GSFRS handling data access in the rover’s software, can lower the energy footprint and preserve limited battery power, which can then be used for other functions. That is how a longer-lasting rover can be developed. If that did not convince you of GSFRS’s potential, let’s talk about submarines. In the United States alone, 71 submarines are currently active, and abundant energy is required to keep them running. The electrical equipment on submarines usually runs off batteries, so the less battery power is consumed, the more energy is saved. Because GSFRS is highly portable across electronic equipment and reduces the number of computations that have to be made, it reduces battery activity.
Figure 1: NASA Mars Perseverance
Figure 2: Submarine electricity production systems
Now the question that may come to mind is: does saving a few minutes or seconds of energy really make that big of a difference? Used once, on one appliance, GSFRS may save only a small, almost insignificant percentage of energy. However, if GSFRS is used long-term in most appliances and current software, both on Earth and in space, the energy saved could add up to a meaningful contribution against global warming. Current data show that 5,000 hours are spent on electronic devices annually; imagine the large positive change that could come from reducing this number through increased productivity.
With Earth Day having just passed on April 22nd, the importance of protecting our planet and slowing climate change has been on many of our minds. Climate change is a long-term shift in the average weather of a particular location, and a major driver of it is human emission of greenhouse gases. Reducing daily energy use reduces the electricity that has to be produced, which in turn reduces one’s carbon footprint and greenhouse gas emissions. Our footprint on this planet is larger than we realize, and using software such as GSFRS to handle big data processing inside other software is one way to shrink it by saving energy.
Figure 3: Simple diagram of how, in the long run, the climate can be positively impacted through widespread use of GSFRS.
Data centers account for a large share of energy costs and usage, and organizations such as the Ernest Orlando Lawrence Berkeley National Laboratory, the U.S. Energy Information Administration, and Resources for the Future track that consumption. The figure below illustrates one report from the Berkeley National Laboratory on data center energy consumption; it shows that consumption has grown over the years and, depending on the scenario, is projected to keep growing. It is our job to stay under the predicted curves through proactive measures, such as using less energy and reducing screen time.
Figure 4: Estimates include energy used for servers, storage, network equipment, and infrastructure in all U.S. data centers. The solid line represents historical estimates from 2000 to 2014, and the dashed lines represent five projection scenarios through 2020: Current Trends, Improved Management (IM), Best Practices (BP), Hyperscale Shift (HS), and the static 2010 Energy Efficiency.
By acting as a tool that samples data without loading the entire file into memory, unlike present-day practice, GSFRS provides an efficient way to process only the necessary information. Random access makes different parts of a data source available for parallel processing in a multi-threaded environment, which helps make optimal use of hardware resources. Apart from the algorithm, the more sophisticated features of GSFRS are built on the modern C++20 standard, using features such as move semantics, the filesystem library, lambda functions, and multi-threading.
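The link between random access and parallel processing can be shown with a small Python sketch. This is an illustration under stated assumptions, not the GSFRS implementation (which is written in C++): it assumes a binary signal file where every sample is one 64-bit little-endian float, and the names `write_signal`, `read_samples`, and `read_ranges_in_parallel` are hypothetical. Because each sample has a fixed size, the position of any sample can be computed directly, so several threads can each fetch a different slice of the file at the same time.

```python
import os
import struct
import tempfile
from concurrent.futures import ThreadPoolExecutor

SAMPLE_SIZE = 8  # assumption: each sample is one little-endian 64-bit float


def write_signal(values):
    """Demo helper (hypothetical): write samples to a temporary binary file, return its path."""
    with tempfile.NamedTemporaryFile("wb", suffix=".bin", delete=False) as tmp:
        tmp.write(struct.pack(f"<{len(values)}d", *values))
        return tmp.name


def read_samples(path, start, count):
    """Seek to sample `start` and read only `count` samples, not the whole file."""
    # Each call opens its own handle, so threads never share a file position.
    with open(path, "rb") as f:
        f.seek(start * SAMPLE_SIZE)
        data = f.read(count * SAMPLE_SIZE)
    return list(struct.unpack(f"<{len(data) // SAMPLE_SIZE}d", data))


def read_ranges_in_parallel(path, ranges, workers=4):
    """Read several (start, count) ranges concurrently; results keep the order of `ranges`."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(lambda r: read_samples(path, *r), ranges))


if __name__ == "__main__":
    path = write_signal([float(i) for i in range(10_000)])
    try:
        print(read_ranges_in_parallel(path, [(0, 3), (5000, 2), (9998, 2)]))
    finally:
        os.remove(path)
```

Each worker touches only the bytes it was asked for, so memory use depends on the size of the requested slices rather than the size of the file. In C++, the same pattern would map naturally onto one lambda per range run on separate threads.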