PySpark

What is PySpark?

PySpark is the Python API for Apache Spark. With PySpark, we can write Spark applications in Python and run them in parallel across a cluster.

With PySpark, we can read and process data from HDFS, AWS S3, relational databases (RDBMS), and many other sources, and we can write the results back to those same sources.

Here are some PySpark Tips and Tricks.

