PySpark problem

Please implement an integer-count program using Spark Streaming. The input is a CSV stream of floating-point numbers; your program needs to round each input number to the nearest integer and count the occurrences of each distinct integer. I'll provide the server program (server.py on Canvas), which generates lines of real numbers in CSV format (i.e., each line contains real numbers separated by commas) over a socket (i.e., an IP address and a port number between 1024 and 65535). You can feed the input data to your program using the nc command. To help you view what the server generates, I also provide a client program (client.py). To see how to run the server (and similarly the client), just do:

$ python3 server.py -h

You need to run server.py first, before running your code.
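
One possible way to approach the task is sketched below, not as a definitive solution. It assumes the classic DStream API (pyspark.streaming), and the host, port, and batch interval are placeholder values rather than anything given in the assignment. The helper names parse_and_round, count_integers, and run_streaming are made up for illustration. Note that Python's built-in round() sends exact .5 ties to the nearest even integer (round(2.5) is 2); if the assignment expects ties to round up, that line would need to change.

```python
from collections import Counter


def parse_and_round(line):
    """Split one CSV line and round each value to the nearest integer."""
    # Skip empty fields, e.g. from a trailing comma or a blank line.
    return [round(float(tok)) for tok in line.split(",") if tok.strip()]


def count_integers(lines):
    """Pure-Python reference for the counts expected from a batch of lines."""
    counts = Counter()
    for line in lines:
        counts.update(parse_and_round(line))
    return dict(counts)


def run_streaming(host="localhost", port=9999, batch_seconds=5):
    """Wire the same logic into a DStream job (all defaults are assumptions)."""
    # Imported lazily so the helpers above work without Spark installed.
    from pyspark import SparkContext
    from pyspark.streaming import StreamingContext

    # local[2]: one thread for the socket receiver, one for processing.
    sc = SparkContext("local[2]", "IntegerCount")
    ssc = StreamingContext(sc, batch_seconds)
    lines = ssc.socketTextStream(host, port)
    counts = (lines.flatMap(parse_and_round)
                   .map(lambda n: (n, 1))
                   .reduceByKey(lambda a, b: a + b))
    counts.pprint()  # print the counts for each batch
    ssc.start()
    ssc.awaitTermination()
```

With server.py already running, calling run_streaming with the same host and port would print counts per batch interval. If running totals across batches are wanted instead, updateStateByKey together with a checkpoint directory is one option. The count_integers helper is only there to check the rounding and counting logic by hand on a few sample lines.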
