Apache Flink

Developer(s): Apache Software Foundation
Stable release: 1.1.3 / 13 October 2016
Development status: Active
Written in: Java and Scala
Operating system: Cross-platform
License: Apache License 2.0
Website: https://flink.apache.org/

Apache Flink is a community-driven open source framework for distributed big data analytics, comparable to Hadoop and Spark. Its core is a distributed streaming dataflow engine written in Java and Scala.[1][2] Flink aims to bridge the gap between MapReduce-like systems and shared-nothing parallel database systems. To that end, it executes arbitrary dataflow programs in a data-parallel and pipelined manner.[3] The pipelined runtime lets Flink execute both bulk/batch and stream processing programs.[4][5] The runtime also supports iterative algorithms natively.[6]
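The idea of pipelined execution can be illustrated with a minimal sketch. The code below is plain Python using generators, not Flink's API; the operator names (source, tokenize, count_words) and the run_pipeline helper are hypothetical and chosen only for illustration. It shows records flowing through a chain of operators one at a time, so that a downstream operator starts working before the upstream operator has finished reading its input.

```python
from collections import Counter
from typing import Iterable, Iterator, List, Tuple


def source(lines: Iterable[str], log: List[str]) -> Iterator[str]:
    """Emit input lines one at a time, recording when each one is read."""
    for line in lines:
        log.append(f"source:{line}")
        yield line


def tokenize(lines: Iterable[str], log: List[str]) -> Iterator[str]:
    """Split each incoming line into lowercase words and forward them."""
    for line in lines:
        for word in line.lower().split():
            log.append(f"tokenize:{word}")
            yield word


def count_words(words: Iterable[str]) -> Iterator[Tuple[str, int]]:
    """Emit a running count for every word as soon as it arrives."""
    counts: Counter = Counter()
    for word in words:
        counts[word] += 1
        yield word, counts[word]


def run_pipeline(lines: Iterable[str]) -> Tuple[List[Tuple[str, int]], List[str]]:
    """Chain the operators; nothing runs until the results are consumed."""
    log: List[str] = []
    results = list(count_words(tokenize(source(lines, log), log)))
    return results, log
```

In this sketch the log interleaves source and tokenize entries: the second line is read only after the words of the first line have already passed through the downstream operators. A real engine such as Flink additionally distributes the operators across machines and runs them in parallel, which this single-process sketch does not attempt to model.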

Flink programs can be written in Java or Scala and are automatically compiled and optimized[7] into dataflow programs that are executed in a cluster or cloud environment.[8] Flink does not provide its own data storage system; input data must be stored in a distributed storage system such as HDFS or HBase. For data stream processing, Flink consumes data from (reliable) message queues such as Kafka.

Development

Apache Flink is developed under the Apache License 2.0[9] by the Apache Flink Community within the Apache Software Foundation. The project is driven by the Berlin-based start-up company dataArtisans, 20 committers, and more than 130 contributors.

Flink Forward

Flink Forward is a conference about Apache Flink. It took place for the first time on 12 and 13 October 2015 in Berlin. The conference featured 2 keynotes, 32 talks from industry and academia, and practical training sessions on Flink. Flink Forward was held again from 12 to 14 September 2016 in Berlin, extended to 3 days, with 43 speakers and a Flink training similar to that of 2015.

History

In 2010, the research project "Stratosphere: Information Management on the Cloud"[10], funded by the German Research Foundation (DFG)[11], was started as a collaboration of Technical University Berlin, Humboldt-Universität zu Berlin, and Hasso-Plattner-Institut Potsdam. Flink is a fork of Stratosphere; it became an Apache Incubator project in March 2014.[12] In December 2014, Flink was accepted as an Apache top-level project.[13][14][15][16]

Version   Original release date   Latest version   Release date   Status
0.9       2015-06-24              0.9.1            2015-09-01     Old version, no longer supported
0.10      2015-11-16              0.10.2           2016-02-11     Old version, no longer supported
1.0       2016-03-08              1.0.3            2016-05-11     Current stable version
1.1       2016-08-08              1.1.3            2016-10-13     Current stable version

References

  1. "Apache Flink: Scalable Batch and Stream Data Processing". apache.org.
  2. "apache/flink". GitHub.
  3. Alexander Alexandrov, Rico Bergmann, Stephan Ewen, Johann-Christoph Freytag, Fabian Hueske, Arvid Heise, Odej Kao, Marcus Leich, Ulf Leser, Volker Markl, Felix Naumann, Mathias Peters, Astrid Rheinländer, Matthias J. Sax, Sebastian Schelter, Mareike Höger, Kostas Tzoumas, and Daniel Warneke. 2014. The Stratosphere platform for big data analytics. The VLDB Journal 23, 6 (December 2014), 939-964.
  4. Ian Pointer (7 May 2015). "Apache Flink: New Hadoop contender squares off against Spark". InfoWorld.
  5. "On Apache Flink. Interview with Volker Markl.". odbms.org.
  6. Stephan Ewen, Kostas Tzoumas, Moritz Kaufmann, and Volker Markl. 2012. Spinning fast iterative data flows. Proc. VLDB Endow. 5, 11 (July 2012), 1268-1279.
  7. Fabian Hueske, Mathias Peters, Matthias J. Sax, Astrid Rheinländer, Rico Bergmann, Aljoscha Krettek, and Kostas Tzoumas. 2012. Opening the black boxes in data flow optimization. Proc. VLDB Endow. 5, 11 (July 2012), 1256-1267.
  8. Daniel Warneke and Odej Kao. 2009. Nephele: efficient parallel data processing in the cloud. In Proceedings of the 2nd Workshop on Many-Task Computing on Grids and Supercomputers (MTAGS '09). ACM, New York, NY, USA, Article 8, 10 pages.
  9. "ASF Git Repos - flink.git/blob - LICENSE". apache.org.
  10. "Stratosphere". stratosphere.eu.
  11. "DFG - Deutsche Forschungsgemeinschaft -". dfg.de.
  12. "Stratosphere". apache.org.
  13. "Project Details for Apache Flink". apache.org.
  14. "The Apache Software Foundation Announces Apache™ Flink™ as a Top-Level Project : The Apache Software Foundation Blog". apache.org.
  15. "Will the mysterious Apache Flink find a sweet spot in the enterprise?". siliconangle.com.
  16. http://www.heise.de/developer/meldung/Big-Data-Apache-Flink-wird-Top-Level-Projekt-2516177.html (in German)
  17. "Benchmarking Streaming Computation Engines: Storm, Flink and Spark Streaming" (PDF). IEEE. May 2016.

This article is issued from Wikipedia, version of 10/19/2016. The text is available under the Creative Commons Attribution/Share Alike license, but additional terms may apply for the media files.