Life happens in real time. From breaking news to breaking servers, real-world events require petabytes of data to be collected, analyzed, and acted on as those events happen. Until recently, real-time data processing was an exotic practice, relegated to narrow domains like time-series analysis for financial markets. For most of us performing analytics, and particularly web analytics, batch processing has been the dominant approach.

In high-speed applications, the ability to process large quantities of incoming data as it arrives is critical to avoid overwhelming the other computers or storage devices in a system. High-speed processing allows:
- scanning for trigger events and sending only interesting data to the network
- improving data quality by performing real-time averaging or digital filtering
- performing numerically intensive data pre-processing to avoid overloading the processors of receiving machines.
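
The first two items in the list above can be illustrated with a minimal sketch. It assumes a stream of numeric sensor readings; the function names `moving_average` and `trigger_events`, the window size, and the threshold are hypothetical choices made for illustration, not part of any particular system. The stream is smoothed with a running average, and only the samples that cross the threshold are passed on.

```python
from collections import deque
from typing import Iterable, Iterator, Tuple


def moving_average(samples: Iterable[float], window: int) -> Iterator[float]:
    """Yield the mean of the most recent `window` samples as each one arrives."""
    if window < 1:
        raise ValueError("window must be at least 1")
    recent = deque(maxlen=window)  # holds at most `window` samples
    total = 0.0
    for value in samples:
        if len(recent) == window:
            total -= recent[0]  # subtract the sample the deque is about to evict
        recent.append(value)
        total += value
        yield total / len(recent)


def trigger_events(samples: Iterable[float], threshold: float) -> Iterator[Tuple[int, float]]:
    """Yield (index, value) only for samples above `threshold`; drop the rest."""
    for index, value in enumerate(samples):
        if value > threshold:
            yield index, value


# Hypothetical readings: a quiet signal with a short burst in the middle.
readings = [1.0, 1.2, 0.9, 5.0, 5.2, 1.1, 1.0]

smoothed = moving_average(readings, window=3)
events = list(trigger_events(smoothed, threshold=2.0))

for index, value in events:
    print(f"sample {index}: smoothed value {value:.2f} exceeds threshold")
```

Both stages are generators, so each sample is handled as it arrives and nothing beyond the current window is held in memory. Only the samples flagged by `trigger_events` would need to be forwarded to the network or to a receiving machine.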