Tag: Spark Streaming

Optimising performance of Spark Streaming applications on AWS EMR

January 17, 2018

by Neha Kaul, Senior Consultant in our Sydney team

Altis recently delivered a real-time analytics platform using Apache Spark Streaming on AWS EMR, with real-time data streamed from AWS Kinesis Streams. In this blog post we provide insights into the key optimisation techniques that were applied to improve performance. These techniques include: Data partitioning

Spark on Windows 10

September 6, 2017

The third in a series of blogs from Anandraj Jagadeesan talks us through downloading Apache Spark on Windows 10, using the new Ubuntu environment.

About Spark

Apache Spark is a fast and general-purpose cluster computing system. It provides high-level APIs in Java, Scala, Python and R, and an optimized engine that supports general execution graphs. It