Hadoop has become an essential component of many companies' infrastructure. Different distributions are maintained and managed by different companies, such as Cloudera, Databricks and AWS. The distribution managed by AWS is named EMR. This distribution is supposedly fully managed by AWS (Not everything [https://forums.aws.amazon.com/
Spark is actively supported by the Apache open source community and is used in production by many well-known companies. In this post, the focus is on productionizing Apache Spark. I will discuss the use cases of Spark and how to enable each of them in a production environment.
Apache Spark On Production (for Data Pipelines)
This is the second post about Running Spark On Production; you can read the first post here [https://ahmedkamal.fly.dev/using-spark-for-data-exploration/]. In the first post, we talked briefly about Spark, then discussed the data exploration use case and compared between
Hive is a data warehouse system built on top of Hadoop for querying and managing data sets. Who? Hive was created by Facebook and is now widely adopted by many firms, including Netflix, Facebook and Bookings. Why? Not everyone is fond of writing Java programs for every problem
In my last post [http://ahmedkamal.me/redis-installation-configuration-and-usage/], I talked briefly about Redis and how to install it. In this post, I will go deeper and introduce a very simple interface for using Redis from Java in seconds. First, I would like to introduce you to
Redis is a popular caching layer and in-memory database used in many large-scale projects. Redis is used by [http://techstacks.io/tech/redis] Twitter, GitHub, Pinterest, Snapchat, StackOverflow and Flickr. It supports data structures such as strings, hashes, lists, sets, sorted sets, bitmaps and geospatial indexes
Hi, in this blog post, I will try to give you some information about Hadoop and Microsoft's distribution of Hadoop, which is called HDInsight. Hadoop is one of the best-known NoSQL and big data solutions. Hadoop is already used by big entities like Facebook, Twitter, Yahoo and many