Who will rule 2020: Trump or Serverless Streams?
Political leaders all over the world remain the dominant powers in global affairs. It's no surprise that US President Donald Trump is one of the most influential among them, and his radical decisions keep him in the news. Even the world's most formidable figures can't ignore the fact that analytics play a principal role in everything. However, we aren't here to talk about Trump or his propaganda; we are here to talk about the future of decision making.
About a year ago, we became part of the digital transformation with the first-ever cloud-based IDE for serverless development. It was no cakewalk: we’ve been burning the candle at both ends trying to cover most of the AWS serverless stack. Working with AWS Kinesis made me realise the beauty of serverless, and my earlier exposure to streaming data with Kafka saved me some time on the rudiments.
Long story short
Did you ever wonder...
- How "Google Search" suggests completions while you’re still typing your query?
- How "Cheapest Airlines" starts appearing everywhere after you search for a country?
- How online role-playing games adjust according to your decisions?
- How gambling sites predict the odds of a live game?
- Why Curry and Thompson were benched while Portland was handing the Warriors their worst loss in their 73-win NBA season?
The power of real-time streaming data analytics is astonishing indeed. Now that serverless technology is gaining momentum, maybe you won’t have to worry about making risky decisions on your own at all. This post covers the basics of "Serverless Streaming Data Processing" and how it will become an influential component of our decision making in the future.
A Series of Streaming Events
Life is an endless series of events. The technology around us has turned it into a stream of digital actions, each emitting data. If you look back and examine your life carefully, you'll see the never-ending string of data you have generated with every digital action. It could be a lot to digest at first, but let’s explore some scenarios and find out what applies to you and me.
- Online banking and convenient e-commerce purchasing capabilities
- Ride-sharing, modern-day travelling and transportation
- Industrial equipment and agricultural use cases like monitored machinery, autonomous tractors and precision farming
- Automated power generation and smart grids, net-zero buildings, smart metering
- Real-estate property recommendations based on geo-location, predictive maintenance
- Online dating and matchmaking relying on complex personality patterns and attribute distribution
- Financial trading according to the real-time changes in the stock market, analytical risk management
- Movies, songs and other digital media with a better experience depending on the demographics, preference, and emotions
- Improved web and mobile application experience based on usage
- Dynamic and personalised experiences in online gaming
- Enhanced social media experiences with hyper-personalisation and predictive analytics
- Telemetry from connected devices or remote data centres, and from geospatial services such as weather and resource assessment
- Sports analytics to enhance players’ performance while reducing health risks
All these events produce data, and lots of it. The sheer frequency of this data emission has made it an increasing burden on the digital space.
Streaming Data
A survey on data conducted last year estimated that, at the current pace of data generation,
1.7 MB of data will be created every second for every person on earth by 2020
Data poured out continuously by a gazillion sources every second has become a fact we can’t ignore. The Big Data discipline was an eye-opener for the tech world: it showed how this once irritating data could be put to use. The same irksome data is now collected and analysed by a new species, namely data scientists 😛. Because these data flows are continuous and usually arrive in small pieces (in the order of kilobytes), they are commonly referred to as streaming data; their records are collected simultaneously and sent on for further processing.
Stream Processing and decision making
A streaming data processing system usually comprises two layers: a storage layer and a processing layer. The storage layer is responsible for ordering large streams of records and for keeping them persistent and accessible at high speed. The processing layer takes care of consuming the data, executing computations on it, and notifying the storage layer to discard records that have already been processed. Processing is done either incrementally, record by record, or over sliding time windows. The processed data is then subjected to streaming analytics operations, and the derived information is used to make context-based decisions. For instance, companies can track changes in public sentiment about their products by continuously analysing social media streams; the world's most influential nations can intervene in decisive events such as presidential elections in other powerful countries; and mobile apps can offer personalised product recommendations based on device geo-location and user emotions.
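To make the two layers a little more concrete, here is a minimal in-memory sketch in Python. It is only an illustration under simplifying assumptions: the names Record, StorageLayer and ProcessingLayer are hypothetical, everything lives in a single process, and timestamps are assumed to arrive in order. Managed services such as Kinesis handle persistence, sharding and checkpointing in far more elaborate ways.

```python
from collections import deque
from dataclasses import dataclass
from itertools import islice


@dataclass
class Record:
    timestamp: float  # event time in seconds (assumed to arrive in order)
    value: float


class StorageLayer:
    """Hypothetical storage layer: keeps records in arrival order
    until the processing layer acknowledges them."""

    def __init__(self):
        self._records = deque()

    def append(self, record):
        self._records.append(record)

    def read_batch(self, limit):
        # Return up to `limit` of the oldest unacknowledged records.
        return list(islice(self._records, limit))

    def acknowledge(self, count):
        # Discard records that have already been processed.
        for _ in range(min(count, len(self._records))):
            self._records.popleft()

    def __len__(self):
        return len(self._records)


class ProcessingLayer:
    """Hypothetical processing layer: consumes records one at a time
    and keeps a running sum over a sliding time window."""

    def __init__(self, storage, window_seconds):
        self.storage = storage
        self.window_seconds = window_seconds
        self.window = deque()

    def poll(self, limit=100):
        batch = self.storage.read_batch(limit)
        results = []
        for record in batch:
            self.window.append(record)
            # Evict records that have slid out of the time window.
            cutoff = record.timestamp - self.window_seconds
            while self.window[0].timestamp <= cutoff:
                self.window.popleft()
            results.append(sum(r.value for r in self.window))
        # Tell the storage layer these records are done.
        self.storage.acknowledge(len(batch))
        return results
```

A producer would call append() on the storage layer, while a consumer would call poll() periodically and feed the returned window sums into whatever analytics come next.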
Most applications start out by collecting a portion of their data to produce simple summary reports and make simple decisions, such as triggering alarms or calculating a moving average. As time goes by, these applications become more and more sophisticated, and companies may want deeper insights to perform intricate activities with the aid of Machine Learning algorithms and data analysis techniques. The continual growth of data has data scientists working around the clock on trailblazing solutions that use as much data as possible to fabricate alternate futures with better decisions.
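As a rough idea of what such a simple decision looks like in code, the Python sketch below keeps a moving average over the last few readings and raises an alarm flag when the average crosses a threshold. The class name MovingAverageAlarm, the window size and the threshold are all made up for illustration.

```python
from collections import deque


class MovingAverageAlarm:
    """Hypothetical helper: moving average over the last `window_size`
    readings, with an alarm flag when the average exceeds `threshold`."""

    def __init__(self, window_size, threshold):
        if window_size <= 0:
            raise ValueError("window_size must be positive")
        self.values = deque(maxlen=window_size)
        self.total = 0.0
        self.threshold = threshold

    def update(self, value):
        # If the window is full, the oldest reading is about to drop out.
        if len(self.values) == self.values.maxlen:
            self.total -= self.values[0]
        self.values.append(value)
        self.total += value
        average = self.total / len(self.values)
        return average, average > self.threshold


if __name__ == "__main__":
    alarm = MovingAverageAlarm(window_size=3, threshold=50)
    for reading in [40, 50, 60, 70]:
        average, triggered = alarm.update(reading)
        print(reading, average, triggered)
```

Keeping a running total means each new reading is handled in constant time, which matters when records arrive continuously.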
How the world embraces it
Many companies use insights from stream analytics to improve the visibility of their businesses, which allows them to deliver a personalised experience to their customers. Additionally, near real-time transparency gives these firms the flexibility to address emergencies promptly. The emerging serverless architecture has driven all the leading cloud platforms to offer complementary solutions: stream processing is now available for serverless application development through fully managed, cloud-based services for real-time processing of large distributed data streams.
1. Entertainment got upgraded!
Netflix, the leading online television network in the world, developed a solution that centralises its flow logs using Amazon Kinesis Streams. For a system processing billions of traffic flows every day, the absence of a database in the architecture eliminates plenty of complexity. Thanks to the high scalability and lightning speed, Netflix can discover and address issues as they arise and monitor the application on a massive scale. Together with the upgraded recommendation algorithm, video transcoding, and licensing of popular media, this gives subscribers a seamless experience. As the subscriber base grows exponentially, the company’s responsibilities increase by the day. Still, this is unlikely to trouble Netflix for many years to come, since it is considered to have a sound decision-making model.
2. Improving the decisions of the decision makers
As a leading source of integrated and intelligent information for businesses and professionals, Thomson Reuters provides its services to decision makers in a wide range of domains such as finance and risk, science, legal, and technology. The company built an in-house analytics engine to take full control of its data, and moved to AWS because it was familiar with the platform's capabilities and scale. The new real-time pipeline, attached to an Amazon Kinesis stream, delivers a more perceptive customer experience, with accurate economic forecasts and financial trends for beneficiaries that include a range of government activities.
3. Unicorn: a solution to traffic congestion
Jakarta has become a heavily congested city where the motorcycle is deemed the most efficient mode of transport. To exploit this business opportunity, GO-JEK, one of the few unicorn businesses in Southeast Asia, started as a call centre for motorcycle taxi bookings. However, as demand exceeded expectations, the company had to consider expansion. Now, with the support of Google Cloud Professional Services, its architecture built on Cloud Dataflow for stream inference enables it to predict changes in demand effectively.
There are many more stories of how companies use cloud platforms like AWS, Google Cloud, Microsoft Azure, and IBM Cloud to make their clients’ lives better and more secure.
Shortcomings of Serverless Stream Processing
Serverless stream processing is increasingly becoming a vital part of decision-making engines. However, with the current set of features, it’s not the ideal solution for some scenarios. Implementing real-time analytics for sliding windows and temporal event patterns is not a course for the faint-hearted.
The best way to assimilate never-ending data of this magnitude is through real-time dashboards, which require additional data organisation and persistence. These manoeuvres introduce undesirable latency and data management issues. However, the technology is evolving and trying to catch up, integrating advanced cloud data management techniques to produce materialised views.
In contrast to batch processing, stream processing usually works on a time-based or record-based window of data, which can lead to challenges in use cases that require query re-execution.
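The difference between the two window types is easier to see side by side. The Python sketch below groups the same events first by record count and then by fixed time intervals. The function names count_windows and time_windows are hypothetical, the windows do not overlap (tumbling rather than sliding), and the time-based version assumes timestamps arrive in non-decreasing order.

```python
def count_windows(records, size):
    """Record-based windows: yield lists of `size` consecutive records.
    The final window may be shorter if the stream ends early."""
    window = []
    for record in records:
        window.append(record)
        if len(window) == size:
            yield window
            window = []
    if window:
        yield window


def time_windows(records, seconds):
    """Time-based windows: group (timestamp, value) pairs into fixed,
    non-overlapping intervals of `seconds` length.
    Assumes timestamps are in non-decreasing order."""
    window = []
    current_key = None
    for timestamp, value in records:
        key = int(timestamp // seconds)
        # A new interval has started, so close the previous window.
        if current_key is not None and key != current_key:
            yield window
            window = []
        current_key = key
        window.append((timestamp, value))
    if window:
        yield window


if __name__ == "__main__":
    events = [(0, "a"), (1, "b"), (4, "c"), (5, "d"), (11, "e")]
    print(list(count_windows(events, 2)))
    print(list(time_windows(events, 5)))
```

Either way, once a window has been emitted its records are gone from the processor, which is why re-running a query over older data is harder here than in a batch job that still has the full data set at hand.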
Nowadays, application requirements grow beyond aggregated analytics. Increasing the window size seems to be an appropriate temporary solution, but it creates another intractable problem: memory management. Modern-day solutions usually provide advanced memory management and scheduling techniques to overcome this, but there is still room for further improvement.
Conclusion
All in all, it’s apparent that serverless stream processing has been playing a prominent role around us without us even knowing. With the power of serverless data stream processing, applications can evolve from traditional batch processing to real-time analytics. The profound insights revealed this way will lead to effective decision making without having to manage infrastructure. Even today, many organisations practise orthodox decision-making strategies based on analytics derived from big data clusters that belong to THE PAST. The new horizons of serverless and real-time data processing now have the power to make effective decisions and create a more productive, more relevant and, most importantly, more secure world around you.
So do you still think Trump is more powerful?
Originally published at SLAppForge Blog.