{"id":231,"date":"2016-10-04T10:05:04","date_gmt":"2016-10-04T17:05:04","guid":{"rendered":"http:\/\/www.riverlog.com\/wordpress\/?p=231"},"modified":"2016-10-04T10:19:37","modified_gmt":"2016-10-04T17:19:37","slug":"eight-things-in-the-design-of-apache-spark-hadoop-application","status":"publish","type":"post","link":"https:\/\/riverlog.bigleafproductions.com\/wordpress\/2016\/10\/eight-things-in-the-design-of-apache-spark-hadoop-application\/","title":{"rendered":"Eight things in the design of Apache Spark hadoop ecosystem."},"content":{"rendered":"<p>There are eight things to consider while designing an Apache Spark enabled application. Porting to a Spark Hadoop ecosystem\u00a0is an important step that is dictated by the need for streaming capabilities and extreme speed of execution. Apache Spark runs on a cluster and can be used with HDFS, making it a composite architecture. Unless you understand the business process and the incoming data, it would be inefficient to build such an architecture. Remember, the value comes from big data volumes and NOT from traditional reports.<\/p>\n<p>1. Spark relies on in-memory execution of tasks and in-memory storage. Because of this, it is important to design your system and build your processes with memory in mind.<\/p>\n<p>2. These days, writing in Java could be more efficient from a resource standpoint, and from the point of view that Java handles its own concurrency better. Just because several APIs are built on Scala does not necessarily mean they will speed up your execution. Therefore it is worthwhile to think of writing in Java.<\/p>\n<p>3. Whether the Spark architecture is in the cloud or standalone, it uses in-memory space for data and executors, so think about the heap size. Continuously increasing the heap size just to get a job to execute may reduce efficiency.<\/p>\n<p>4. 
Using User Memory is not recommended unless your architecture really demands it for core, extremely high-speed streaming needs, such as detecting fraudulent activity across a huge segment OR detecting the failure of a system within your application cluster.<\/p>\n<p>5. Take advantage of Unified Memory Management, which requires Spark 1.6.x or above. This type of management appears to use memory in a more dynamic way, where the executor and data can push past their limits if needed rather than fail.<\/p>\n<p>6. Consider nodes as individual machines. This will help in your infrastructure planning because every Spark executor in an application has the same fixed number of cores and the same fixed heap size.<\/p>\n<p>7. Before using Mesos, consider using Hadoop\/YARN.<\/p>\n<p>8. Architecture is an art. So imagine, understand, absorb, design, travel through the design, re-design and architect, test small, test big, implement by deploying it in the cloud (perhaps this is the ideal case), and go live.<\/p>\n<p>Meet me at #DreamForce #df16. Find out how it would benefit you and how to set up the meeting at\u00a0<a class=\"twitter-timeline-link\" dir=\"ltr\" title=\"http:\/\/bit.ly\/2dau9xq\" href=\"https:\/\/t.co\/BpEbwgtCQa\" target=\"_blank\" rel=\"nofollow\" data-expanded-url=\"http:\/\/bit.ly\/2dau9xq\"><span class=\"js-display-url\">bit.ly\/2dau9xq<\/span><span class=\"tco-ellipsis\"><span class=\"invisible\">\u00a0<\/span><\/span><\/a>.<\/p>\n<p>Thank you.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>There are 8 things while designing an Apache Spark enabled application. 
Porting to a SPARK hadoop eco-system\u00a0is an important step that &hellip; <\/p>\n","protected":false},"author":1,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":[],"categories":[1],"tags":[],"yoast_head":"<!-- This site is optimized with the Yoast SEO plugin v20.7 - https:\/\/yoast.com\/wordpress\/plugins\/seo\/ -->\n<title>Eight things in the design of Apache Spark hadoop ecosystem. - AI, Generative AI, Cybersecurity, and Cloud: The Power Quartet Driving Enterprise Software Evolution<\/title>\n<meta name=\"robots\" content=\"index, follow, max-snippet:-1, max-image-preview:large, max-video-preview:-1\" \/>\n<link rel=\"canonical\" href=\"https:\/\/riverlog.bigleafproductions.com\/wordpress\/2016\/10\/eight-things-in-the-design-of-apache-spark-hadoop-application\/\" \/>\n<meta property=\"og:locale\" content=\"en_US\" \/>\n<meta property=\"og:type\" content=\"article\" \/>\n<meta property=\"og:title\" content=\"Eight things in the design of Apache Spark hadoop ecosystem. - AI, Generative AI, Cybersecurity, and Cloud: The Power Quartet Driving Enterprise Software Evolution\" \/>\n<meta property=\"og:description\" content=\"There 8 things while designing an Apache Spark enabled application. 
Porting to a SPARK hadoop eco-system\u00a0is an important step that &hellip;\" \/>\n<meta property=\"og:url\" content=\"https:\/\/riverlog.bigleafproductions.com\/wordpress\/2016\/10\/eight-things-in-the-design-of-apache-spark-hadoop-application\/\" \/>\n<meta property=\"og:site_name\" content=\"AI, Generative AI, Cybersecurity, and Cloud: The Power Quartet Driving Enterprise Software Evolution\" \/>\n<meta property=\"article:publisher\" content=\"http:\/\/www.facebook.com\/riverlog\" \/>\n<meta property=\"article:published_time\" content=\"2016-10-04T17:05:04+00:00\" \/>\n<meta property=\"article:modified_time\" content=\"2016-10-04T17:19:37+00:00\" \/>\n<meta name=\"author\" content=\"Sunny Menon\" \/>\n<meta name=\"twitter:card\" content=\"summary_large_image\" \/>\n<meta name=\"twitter:creator\" content=\"@sunnymenon\" \/>\n<meta name=\"twitter:site\" content=\"@sunnymenon\" \/>\n<meta name=\"twitter:label1\" content=\"Written by\" \/>\n\t<meta name=\"twitter:data1\" content=\"Sunny Menon\" \/>\n\t<meta name=\"twitter:label2\" content=\"Est. 
reading time\" \/>\n\t<meta name=\"twitter:data2\" content=\"2 minutes\" \/>\n<script type=\"application\/ld+json\" class=\"yoast-schema-graph\">{\"@context\":\"https:\/\/schema.org\",\"@graph\":[{\"@type\":\"Article\",\"@id\":\"https:\/\/riverlog.bigleafproductions.com\/wordpress\/2016\/10\/eight-things-in-the-design-of-apache-spark-hadoop-application\/#article\",\"isPartOf\":{\"@id\":\"https:\/\/riverlog.bigleafproductions.com\/wordpress\/2016\/10\/eight-things-in-the-design-of-apache-spark-hadoop-application\/\"},\"author\":{\"name\":\"Sunny Menon\",\"@id\":\"https:\/\/riverlog.bigleafproductions.com\/wordpress\/#\/schema\/person\/eb991291ee063bd60d69090852e2f5a0\"},\"headline\":\"Eight things in the design of Apache Spark hadoop ecosystem.\",\"datePublished\":\"2016-10-04T17:05:04+00:00\",\"dateModified\":\"2016-10-04T17:19:37+00:00\",\"mainEntityOfPage\":{\"@id\":\"https:\/\/riverlog.bigleafproductions.com\/wordpress\/2016\/10\/eight-things-in-the-design-of-apache-spark-hadoop-application\/\"},\"wordCount\":393,\"commentCount\":0,\"publisher\":{\"@id\":\"https:\/\/riverlog.bigleafproductions.com\/wordpress\/#organization\"},\"articleSection\":[\"Big Data BigData Cloud\"],\"inLanguage\":\"en-US\",\"potentialAction\":[{\"@type\":\"CommentAction\",\"name\":\"Comment\",\"target\":[\"https:\/\/riverlog.bigleafproductions.com\/wordpress\/2016\/10\/eight-things-in-the-design-of-apache-spark-hadoop-application\/#respond\"]}]},{\"@type\":\"WebPage\",\"@id\":\"https:\/\/riverlog.bigleafproductions.com\/wordpress\/2016\/10\/eight-things-in-the-design-of-apache-spark-hadoop-application\/\",\"url\":\"https:\/\/riverlog.bigleafproductions.com\/wordpress\/2016\/10\/eight-things-in-the-design-of-apache-spark-hadoop-application\/\",\"name\":\"Eight things in the design of Apache Spark hadoop ecosystem. 
- AI, Generative AI, Cybersecurity, and Cloud: The Power Quartet Driving Enterprise Software Evolution\",\"isPartOf\":{\"@id\":\"https:\/\/riverlog.bigleafproductions.com\/wordpress\/#website\"},\"datePublished\":\"2016-10-04T17:05:04+00:00\",\"dateModified\":\"2016-10-04T17:19:37+00:00\",\"breadcrumb\":{\"@id\":\"https:\/\/riverlog.bigleafproductions.com\/wordpress\/2016\/10\/eight-things-in-the-design-of-apache-spark-hadoop-application\/#breadcrumb\"},\"inLanguage\":\"en-US\",\"potentialAction\":[{\"@type\":\"ReadAction\",\"target\":[\"https:\/\/riverlog.bigleafproductions.com\/wordpress\/2016\/10\/eight-things-in-the-design-of-apache-spark-hadoop-application\/\"]}]},{\"@type\":\"BreadcrumbList\",\"@id\":\"https:\/\/riverlog.bigleafproductions.com\/wordpress\/2016\/10\/eight-things-in-the-design-of-apache-spark-hadoop-application\/#breadcrumb\",\"itemListElement\":[{\"@type\":\"ListItem\",\"position\":1,\"name\":\"Home\",\"item\":\"https:\/\/riverlog.bigleafproductions.com\/wordpress\/\"},{\"@type\":\"ListItem\",\"position\":2,\"name\":\"Eight things in the design of Apache Spark hadoop ecosystem.\"}]},{\"@type\":\"WebSite\",\"@id\":\"https:\/\/riverlog.bigleafproductions.com\/wordpress\/#website\",\"url\":\"https:\/\/riverlog.bigleafproductions.com\/wordpress\/\",\"name\":\"The BigData,Artificial Intelligence, Enterprise Process Modernization, Startups and other day to day thinking.\",\"description\":\"Weblog By RiverLog\",\"publisher\":{\"@id\":\"https:\/\/riverlog.bigleafproductions.com\/wordpress\/#organization\"},\"potentialAction\":[{\"@type\":\"SearchAction\",\"target\":{\"@type\":\"EntryPoint\",\"urlTemplate\":\"https:\/\/riverlog.bigleafproductions.com\/wordpress\/?s={search_term_string}\"},\"query-input\":\"required name=search_term_string\"}],\"inLanguage\":\"en-US\"},{\"@type\":\"Organization\",\"@id\":\"https:\/\/riverlog.bigleafproductions.com\/wordpress\/#organization\",\"name\":\"RiverLog Software Consulting & Advisory 
Services\",\"url\":\"https:\/\/riverlog.bigleafproductions.com\/wordpress\/\",\"logo\":{\"@type\":\"ImageObject\",\"inLanguage\":\"en-US\",\"@id\":\"https:\/\/riverlog.bigleafproductions.com\/wordpress\/#\/schema\/logo\/image\/\",\"url\":\"https:\/\/riverlog.bigleafproductions.com\/wordpress\/wp-content\/riverlog_logo_Enhanced_square.png\",\"contentUrl\":\"https:\/\/riverlog.bigleafproductions.com\/wordpress\/wp-content\/riverlog_logo_Enhanced_square.png\",\"width\":195,\"height\":195,\"caption\":\"RiverLog Software Consulting & Advisory Services\"},\"image\":{\"@id\":\"https:\/\/riverlog.bigleafproductions.com\/wordpress\/#\/schema\/logo\/image\/\"},\"sameAs\":[\"http:\/\/www.facebook.com\/riverlog\",\"https:\/\/twitter.com\/sunnymenon\",\"https:\/\/www.linkedin.com\/in\/sunnymenon10\/\"]},{\"@type\":\"Person\",\"@id\":\"https:\/\/riverlog.bigleafproductions.com\/wordpress\/#\/schema\/person\/eb991291ee063bd60d69090852e2f5a0\",\"name\":\"Sunny Menon\",\"image\":{\"@type\":\"ImageObject\",\"inLanguage\":\"en-US\",\"@id\":\"https:\/\/riverlog.bigleafproductions.com\/wordpress\/#\/schema\/person\/image\/\",\"url\":\"https:\/\/secure.gravatar.com\/avatar\/139eb39436cf90d010303dd411d67d0d?s=96&d=wavatar&r=g\",\"contentUrl\":\"https:\/\/secure.gravatar.com\/avatar\/139eb39436cf90d010303dd411d67d0d?s=96&d=wavatar&r=g\",\"caption\":\"Sunny Menon\"},\"description\":\"Sunny Menon is a software executive with large enterprise architecture background with over 20+ years of experience in the design, architecture, development of high volume enterprise applications. He has experience enabling cloud environment for enterprise applications. Designed and developed a bigdata products. He has helped #startups evolve from conceptual stages through definition of the actual product by aligning them with industry requirements, developing proof-of-concept and demonstrating the product thereby, helping in seeking funding from VCs. 
He has extensive experience in the integration of large enterprise applications, middle-ware and modernization of enterprise applications centered around SOA\/SaaS\/PaaS\/Cloud environments. At night, he enjoys 'staring' at the night skies and thinks, twinkle twinkle little star, how I STILL WONDER what you are.... He is a cruel poet who walks bare foot at times, to feel the beauty of the earth. Advisory Board Member and Investor : RiverLog Software Technical advisory to SOADevelopers.com Executive Board Member - Stencil Research Strategist : TechBeat Conference.com CISO - Paragon Security\",\"sameAs\":[\"http:\/\/riverlog.com\"],\"url\":\"https:\/\/riverlog.bigleafproductions.com\/wordpress\/author\/admin\/\"}]}<\/script>\n<!-- \/ Yoast SEO plugin. -->","yoast_head_json":{"title":"Eight things in the design of Apache Spark hadoop ecosystem. - AI, Generative AI, Cybersecurity, and Cloud: The Power Quartet Driving Enterprise Software Evolution","robots":{"index":"index","follow":"follow","max-snippet":"max-snippet:-1","max-image-preview":"max-image-preview:large","max-video-preview":"max-video-preview:-1"},"canonical":"https:\/\/riverlog.bigleafproductions.com\/wordpress\/2016\/10\/eight-things-in-the-design-of-apache-spark-hadoop-application\/","og_locale":"en_US","og_type":"article","og_title":"Eight things in the design of Apache Spark hadoop ecosystem. - AI, Generative AI, Cybersecurity, and Cloud: The Power Quartet Driving Enterprise Software Evolution","og_description":"There 8 things while designing an Apache Spark enabled application. 
Porting to a SPARK hadoop eco-system\u00a0is an important step that &hellip;","og_url":"https:\/\/riverlog.bigleafproductions.com\/wordpress\/2016\/10\/eight-things-in-the-design-of-apache-spark-hadoop-application\/","og_site_name":"AI, Generative AI, Cybersecurity, and Cloud: The Power Quartet Driving Enterprise Software Evolution","article_publisher":"http:\/\/www.facebook.com\/riverlog","article_published_time":"2016-10-04T17:05:04+00:00","article_modified_time":"2016-10-04T17:19:37+00:00","author":"Sunny Menon","twitter_card":"summary_large_image","twitter_creator":"@sunnymenon","twitter_site":"@sunnymenon","twitter_misc":{"Written by":"Sunny Menon","Est. reading time":"2 minutes"},"schema":{"@context":"https:\/\/schema.org","@graph":[{"@type":"Article","@id":"https:\/\/riverlog.bigleafproductions.com\/wordpress\/2016\/10\/eight-things-in-the-design-of-apache-spark-hadoop-application\/#article","isPartOf":{"@id":"https:\/\/riverlog.bigleafproductions.com\/wordpress\/2016\/10\/eight-things-in-the-design-of-apache-spark-hadoop-application\/"},"author":{"name":"Sunny Menon","@id":"https:\/\/riverlog.bigleafproductions.com\/wordpress\/#\/schema\/person\/eb991291ee063bd60d69090852e2f5a0"},"headline":"Eight things in the design of Apache Spark hadoop ecosystem.","datePublished":"2016-10-04T17:05:04+00:00","dateModified":"2016-10-04T17:19:37+00:00","mainEntityOfPage":{"@id":"https:\/\/riverlog.bigleafproductions.com\/wordpress\/2016\/10\/eight-things-in-the-design-of-apache-spark-hadoop-application\/"},"wordCount":393,"commentCount":0,"publisher":{"@id":"https:\/\/riverlog.bigleafproductions.com\/wordpress\/#organization"},"articleSection":["Big Data BigData 
Cloud"],"inLanguage":"en-US","potentialAction":[{"@type":"CommentAction","name":"Comment","target":["https:\/\/riverlog.bigleafproductions.com\/wordpress\/2016\/10\/eight-things-in-the-design-of-apache-spark-hadoop-application\/#respond"]}]},{"@type":"WebPage","@id":"https:\/\/riverlog.bigleafproductions.com\/wordpress\/2016\/10\/eight-things-in-the-design-of-apache-spark-hadoop-application\/","url":"https:\/\/riverlog.bigleafproductions.com\/wordpress\/2016\/10\/eight-things-in-the-design-of-apache-spark-hadoop-application\/","name":"Eight things in the design of Apache Spark hadoop ecosystem. - AI, Generative AI, Cybersecurity, and Cloud: The Power Quartet Driving Enterprise Software Evolution","isPartOf":{"@id":"https:\/\/riverlog.bigleafproductions.com\/wordpress\/#website"},"datePublished":"2016-10-04T17:05:04+00:00","dateModified":"2016-10-04T17:19:37+00:00","breadcrumb":{"@id":"https:\/\/riverlog.bigleafproductions.com\/wordpress\/2016\/10\/eight-things-in-the-design-of-apache-spark-hadoop-application\/#breadcrumb"},"inLanguage":"en-US","potentialAction":[{"@type":"ReadAction","target":["https:\/\/riverlog.bigleafproductions.com\/wordpress\/2016\/10\/eight-things-in-the-design-of-apache-spark-hadoop-application\/"]}]},{"@type":"BreadcrumbList","@id":"https:\/\/riverlog.bigleafproductions.com\/wordpress\/2016\/10\/eight-things-in-the-design-of-apache-spark-hadoop-application\/#breadcrumb","itemListElement":[{"@type":"ListItem","position":1,"name":"Home","item":"https:\/\/riverlog.bigleafproductions.com\/wordpress\/"},{"@type":"ListItem","position":2,"name":"Eight things in the design of Apache Spark hadoop ecosystem."}]},{"@type":"WebSite","@id":"https:\/\/riverlog.bigleafproductions.com\/wordpress\/#website","url":"https:\/\/riverlog.bigleafproductions.com\/wordpress\/","name":"The BigData,Artificial Intelligence, Enterprise Process Modernization, Startups and other day to day thinking.","description":"Weblog By 
RiverLog","publisher":{"@id":"https:\/\/riverlog.bigleafproductions.com\/wordpress\/#organization"},"potentialAction":[{"@type":"SearchAction","target":{"@type":"EntryPoint","urlTemplate":"https:\/\/riverlog.bigleafproductions.com\/wordpress\/?s={search_term_string}"},"query-input":"required name=search_term_string"}],"inLanguage":"en-US"},{"@type":"Organization","@id":"https:\/\/riverlog.bigleafproductions.com\/wordpress\/#organization","name":"RiverLog Software Consulting & Advisory Services","url":"https:\/\/riverlog.bigleafproductions.com\/wordpress\/","logo":{"@type":"ImageObject","inLanguage":"en-US","@id":"https:\/\/riverlog.bigleafproductions.com\/wordpress\/#\/schema\/logo\/image\/","url":"https:\/\/riverlog.bigleafproductions.com\/wordpress\/wp-content\/riverlog_logo_Enhanced_square.png","contentUrl":"https:\/\/riverlog.bigleafproductions.com\/wordpress\/wp-content\/riverlog_logo_Enhanced_square.png","width":195,"height":195,"caption":"RiverLog Software Consulting & Advisory Services"},"image":{"@id":"https:\/\/riverlog.bigleafproductions.com\/wordpress\/#\/schema\/logo\/image\/"},"sameAs":["http:\/\/www.facebook.com\/riverlog","https:\/\/twitter.com\/sunnymenon","https:\/\/www.linkedin.com\/in\/sunnymenon10\/"]},{"@type":"Person","@id":"https:\/\/riverlog.bigleafproductions.com\/wordpress\/#\/schema\/person\/eb991291ee063bd60d69090852e2f5a0","name":"Sunny Menon","image":{"@type":"ImageObject","inLanguage":"en-US","@id":"https:\/\/riverlog.bigleafproductions.com\/wordpress\/#\/schema\/person\/image\/","url":"https:\/\/secure.gravatar.com\/avatar\/139eb39436cf90d010303dd411d67d0d?s=96&d=wavatar&r=g","contentUrl":"https:\/\/secure.gravatar.com\/avatar\/139eb39436cf90d010303dd411d67d0d?s=96&d=wavatar&r=g","caption":"Sunny Menon"},"description":"Sunny Menon is a software executive with large enterprise architecture background with over 20+ years of experience in the design, architecture, development of high volume enterprise applications. 
He has experience enabling cloud environment for enterprise applications. Designed and developed a bigdata products. He has helped #startups evolve from conceptual stages through definition of the actual product by aligning them with industry requirements, developing proof-of-concept and demonstrating the product thereby, helping in seeking funding from VCs. He has extensive experience in the integration of large enterprise applications, middle-ware and modernization of enterprise applications centered around SOA\/SaaS\/PaaS\/Cloud environments. At night, he enjoys 'staring' at the night skies and thinks, twinkle twinkle little star, how I STILL WONDER what you are.... He is a cruel poet who walks bare foot at times, to feel the beauty of the earth. Advisory Board Member and Investor : RiverLog Software Technical advisory to SOADevelopers.com Executive Board Member - Stencil Research Strategist : TechBeat Conference.com CISO - Paragon Security","sameAs":["http:\/\/riverlog.com"],"url":"https:\/\/riverlog.bigleafproductions.com\/wordpress\/author\/admin\/"}]}},"_links":{"self":[{"href":"https:\/\/riverlog.bigleafproductions.com\/wordpress\/wp-json\/wp\/v2\/posts\/231"}],"collection":[{"href":"https:\/\/riverlog.bigleafproductions.com\/wordpress\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/riverlog.bigleafproductions.com\/wordpress\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/riverlog.bigleafproductions.com\/wordpress\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/riverlog.bigleafproductions.com\/wordpress\/wp-json\/wp\/v2\/comments?post=231"}],"version-history":[{"count":7,"href":"https:\/\/riverlog.bigleafproductions.com\/wordpress\/wp-json\/wp\/v2\/posts\/231\/revisions"}],"predecessor-version":[{"id":239,"href":"https:\/\/riverlog.bigleafproductions.com\/wordpress\/wp-json\/wp\/v2\/posts\/231\/revisions\/239"}],"wp:attachment":[{"href":"https:\/\/riverlog.bigleafproductions.com\/wordpress\
/wp-json\/wp\/v2\/media?parent=231"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/riverlog.bigleafproductions.com\/wordpress\/wp-json\/wp\/v2\/categories?post=231"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/riverlog.bigleafproductions.com\/wordpress\/wp-json\/wp\/v2\/tags?post=231"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}