Developers need higher performance databases to unlock the full potential of exciting but ever more data-hungry applications.
IDC forecasts that the global datasphere will grow from 45 zettabytes in 2019 to 175 zettabytes by 2025. Furthermore, the analysts expect that around 30 percent of the world’s data will need real-time processing.
“Today, more than five billion consumers interact with data every day — by 2025, that number will be six billion, or 75 percent of the world’s population. In 2025, each connected person will have at least one data interaction every 18 seconds” – The Digitization of the World.
Ryan spoke with Nicolas Hourcard, CEO and Co-Founder of QuestDB, on the advantages of using a time-series database to help achieve the levels of performance required to meet such demands.
Developer: What inspired you to launch QuestDB?
Nicolas Hourcard: Our CTO worked in electronic trading for more than 10 years and dealt intimately with databases to power such systems. In 2013, his boss would not allow him to use the only high-performance database suited to deal with time-series data because of its proprietary nature and price.
His first goal was to build a time-series database to democratise this sort of performance, until then only accessible to a small group of developers in trading.
The second goal was to make QuestDB very accessible through a language that every developer can use: SQL. The pillars of QuestDB are, therefore, extreme performance, an open-source distribution model, and native SQL support.
Developer: What are the advantages of a time-series database?
NH: Time-series databases continuously accumulate data points over time. As use cases which generate data suitable for time series analysis are increasing exponentially, so is the amount of raw data itself.
Traditional databases struggle to store and serve such high volumes of data efficiently. Purpose-built time-series databases offer high ingestion rates (WRITE operations) and can generally retrieve data across time ranges efficiently (READ operations).
Beyond performance, time-series databases make it easy to search data over time with dedicated functions and syntax. They should support downsampling, time-series joins to correlate different series over time, and interval search, and they should be able to ingest unstructured data via specialised protocols such as the InfluxDB line protocol.
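To make the downsampling idea concrete, here is a minimal Python sketch, not QuestDB's implementation: it averages raw readings into fixed-width time buckets. The function name `downsample` and the sample readings are hypothetical and chosen only for illustration.

```python
from collections import defaultdict


def downsample(points, bucket_seconds):
    """Average (timestamp, value) points into fixed-width time buckets."""
    buckets = defaultdict(list)
    for ts, value in points:
        # Align each timestamp to the start of the bucket it falls into.
        bucket_start = ts - (ts % bucket_seconds)
        buckets[bucket_start].append(value)
    # Return one averaged point per bucket, ordered by time.
    return [(start, sum(vals) / len(vals)) for start, vals in sorted(buckets.items())]


# Hypothetical sensor readings: (seconds since start, value).
readings = [(0, 10.0), (20, 14.0), (60, 20.0), (90, 22.0), (125, 30.0)]
per_minute = downsample(readings, 60)
print(per_minute)  # [(0, 12.0), (60, 21.0), (120, 30.0)]
```

A time-series database expresses the same operation declaratively in a query, so the bucketing runs inside the engine rather than in application code.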
Developer: What differentiates QuestDB from the usual suspects such as InfluxDB or TimescaleDB?
NH: In one word: performance. We could achieve this sort of performance because we built our stack from the ground up, with zero dependencies. Any database is only as fast as its slowest component, and we do not depend on a platform that was not designed to handle time-series data.
We have also implemented techniques found in low-latency trading software. We store data in columns and partition it by time, so a query only lifts the data it needs, and we use SIMD instructions to execute multiple operations in parallel.
We rely heavily on parallelisation, slicing incoming data into multiple chunks and ingesting them all simultaneously. We have put a 1.6 billion row dataset with 10 years’ worth of NYC taxi rides and weather data on our website so users can experience lightning-fast millisecond queries.
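The storage layout described above, columns partitioned by time, can be illustrated with a toy Python sketch. It is an assumption-laden illustration rather than QuestDB's actual engine: the `PartitionedTable` class, the daily partition size, and the `pickup_ts` and `fare` columns are all hypothetical.

```python
from datetime import date


class PartitionedTable:
    """Toy column store holding one set of column lists per day."""

    def __init__(self):
        # Maps a day to its columns; each column is a plain Python list.
        self.partitions = {}

    def append(self, day, pickup_ts, fare):
        cols = self.partitions.setdefault(day, {"pickup_ts": [], "fare": []})
        cols["pickup_ts"].append(pickup_ts)
        cols["fare"].append(fare)

    def sum_fare(self, start_day, end_day):
        """Sum the fare column, reading only partitions inside the range."""
        total = 0.0
        partitions_read = 0
        for day, cols in self.partitions.items():
            if start_day <= day <= end_day:
                # Only the "fare" column is read; "pickup_ts" is never touched.
                total += sum(cols["fare"])
                partitions_read += 1
        return total, partitions_read


table = PartitionedTable()
table.append(date(2019, 1, 1), 1546300800, 10.0)
table.append(date(2019, 1, 1), 1546304400, 5.5)
table.append(date(2019, 1, 2), 1546387200, 7.25)
table.append(date(2019, 1, 3), 1546473600, 12.0)

total, partitions_read = table.sum_fare(date(2019, 1, 1), date(2019, 1, 1))
print(total, partitions_read)  # 15.5 1
```

The query for a single day reads one partition out of three and one column out of two, which is the sense in which such a layout only lifts the data needed.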
Developer: On r/programming I noticed a complaint that QuestDB is missing Grafana. It’s my understanding that’s now supported. Was that a response to demand or always in the pipeline?
NH: It had always been in the pipeline, but our community kept asking for it, so we prioritised the integration based on that feedback. You can now visualise QuestDB data on the fly through Grafana dashboards.
Developer: PostgreSQL Wire Protocol was another popular request which has since been added. Are there any other recent additions which you’re particularly proud of?
NH: Giving our users access to the entire Postgres ecosystem out of the box was one of our priorities. Through the PostgreSQL wire protocol you can, for example, connect to Grafana or subscribe to topics from Kafka. Soon enough, all major BI tools will be supported.
Another integration we see as paramount for our user base is our native InfluxDB Line Protocol – InfluxDB users can send the same unstructured data (following the tag/set model) to QuestDB without having to specify a schema in advance. This makes it very easy to try QuestDB in parallel with InfluxDB and eventually switch from one to the other as a drop-in replacement.
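As a rough illustration of what schema-free ingestion looks like, the sketch below formats one record as an InfluxDB line protocol string: a measurement name, optional tags, one or more fields, and an optional timestamp. The helper names `to_line` and `format_field` and the sample trade record are hypothetical, and the sketch stops at building the text; how the line is transported to a server is not shown.

```python
def _escape(text, chars):
    """Backslash-escape each of the given special characters."""
    for ch in chars:
        text = text.replace(ch, "\\" + ch)
    return text


def format_field(value):
    """Render a field value using line protocol type conventions."""
    if isinstance(value, bool):  # check bool first, since bool is a subclass of int
        return "true" if value else "false"
    if isinstance(value, int):
        return f"{value}i"  # integers carry an "i" suffix
    if isinstance(value, float):
        return repr(value)
    # Strings are double-quoted, with backslashes and quotes escaped.
    escaped = str(value).replace("\\", "\\\\").replace('"', '\\"')
    return f'"{escaped}"'


def to_line(measurement, tags, fields, timestamp_ns=None):
    """Build one line: measurement[,tag=value...] field=value[,...] [timestamp]."""
    if not fields:
        raise ValueError("at least one field is required")
    line = _escape(measurement, ", ")
    for key, value in tags.items():
        line += "," + _escape(key, ",= ") + "=" + _escape(str(value), ",= ")
    line += " " + ",".join(
        _escape(key, ",= ") + "=" + format_field(value) for key, value in fields.items()
    )
    if timestamp_ns is not None:
        line += f" {timestamp_ns}"
    return line


line = to_line(
    "trades",
    {"symbol": "BTC-USD", "side": "buy"},
    {"price": 19250.5, "qty": 3},
    1606780800000000000,
)
print(line)  # trades,symbol=BTC-USD,side=buy price=19250.5,qty=3i 1606780800000000000
```

Because every line carries its own tag and field names, a new column can appear in the data without any table definition being written beforehand.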
Developer: You probably hate this question – but with Ethereum 2.0 finally beginning its initial rollout – what are your thoughts on blockchain and distributed ledger technologies?
NH: Our founding team has spent some time in crypto and we believe that QuestDB can fulfil all requirements for market data and machine learning purposes. I am a believer in Bitcoin and other permissionless networks such as Ethereum because censorship resistance is very relevant for a multitude of use cases.
However, for most enterprise use cases, scalability would quickly become an issue, and it does not seem like enterprise blockchain technologies are anywhere near the performance of centralised databases.
Developer: In a recent blog post, you said the end of Moore’s Law is in sight and the onus will soon be on developers to ensure they’re writing efficient code rather than relying on hardware advances. Do you think there’s ample room to make software efficiency improvements to handle the exponential propagation of data in the coming years?
NH: The explosion of data and hardware-related costs, coupled with environmental concerns, will put more emphasis on extracting performance through carefully written software. It’s becoming increasingly difficult to improve CPU processing power as we approach the physical limits of the hardware itself.
Throwing more machines or cloud resources at the problem is not a sustainable solution. Writing code that executes faster is one major answer.
We believe that there have also been fewer incentives for developers to write lean code as a side-effect of advances in computing power in recent decades. The mindset of reusing existing libraries to build a product may also need to evolve, and this will take some time. But the forces mentioned above will catalyse this change.
Developer: You pulled off a successful $2.3 million seed round earlier this year during incredibly challenging economic times—an impressive feat and testament to investors’ confidence in QuestDB. What’s next for the company?
NH: We are building our community and seeing faster enterprise adoption than we anticipated.
As we keep building this momentum, we are going to hire more developers next year, push several key features (such as replication), and roll out our enterprise offering for our paying customers.
You can find out more about QuestDB here.
(Photo by Shawn Lee on Unsplash)