MatthewAdragna/CS4433Proj3

IN ORDER TO GENERATE DATA:

  cd to the GeneratingCode folder, then:
  FOR CONCERT DATA:
        Run the generateConcert script using
          py generateConcert.py
          with optional arguments
            --path "PUT YOUR PATH IN HERE" (default is ../input/)
            --covidchance inputChance (default is .01)
            --size inputSizeInMB (default is 10)
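
The optional arguments above could be parsed along the following lines. This is only a sketch of the documented interface, not the actual contents of generateConcert.py: the function name `build_parser`, the help strings, and the choice of `float`/`int` types are assumptions made for illustration.

```python
import argparse


def build_parser():
    # Hypothetical parser mirroring the flags documented above;
    # the real generateConcert.py may define them differently.
    parser = argparse.ArgumentParser(description="Generate concert data")
    parser.add_argument("--path", default="../input/",
                        help="output folder (documented default: ../input/)")
    parser.add_argument("--covidchance", type=float, default=0.01,
                        help="chance value (documented default: .01)")
    parser.add_argument("--size", type=int, default=10,
                        help="target size in MB (documented default: 10)")
    return parser


# Parse an explicit argument list so the sketch runs without a command line.
args = build_parser().parse_args(["--size", "25"])
print(args.path, args.covidchance, args.size)
```

Passing an empty list to `parse_args` would fall back to all three documented defaults.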
         


  FOR TRANSACTION DATA:
    Run the generateTransactions script using
      py generateTransactions.py
        with optional arguments
          --path "PUT YOUR PATH IN HERE" (default is ../input/)
          --customers int (default is 50000)
          --purchases int (default is 5000000)
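
As a convenience, the command line for the transaction generator can be assembled from the documented defaults and a few overrides. The helper below is a hypothetical sketch (`DEFAULTS` and `transactions_command` are names invented here); it only builds the argument list and does not run the script.

```python
import shlex

# Defaults as documented above for generateTransactions.py.
DEFAULTS = {"path": "../input/", "customers": 50000, "purchases": 5000000}


def transactions_command(**overrides):
    # Reject option names that the README does not document.
    unknown = set(overrides) - set(DEFAULTS)
    if unknown:
        raise ValueError(f"unknown option(s): {sorted(unknown)}")
    options = {**DEFAULTS, **overrides}
    cmd = ["py", "generateTransactions.py"]
    for name, value in options.items():
        cmd += [f"--{name}", str(value)]
    return cmd


# Print a shell-ready command for a smaller test dataset.
print(shlex.join(transactions_command(customers=1000, purchases=20000)))
```

The resulting list can be handed to `subprocess.run` from inside the GeneratingCode folder if running the generator from another script is preferred.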


  In order to run the queries:
      For Part 1, run
              ~/spark/bin/spark-submit part1q1q2.py
          and
              ~/spark/bin/spark-submit part1q3.py

      For Part 2, run
              ~/spark/bin/spark-submit part2.py
      

      Make sure that Hadoop and Spark are running before submitting the queries.
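
The three submissions can also be driven from one small script. The sketch below assumes Spark is installed under `~/spark` as in the commands above and that it is started from the folder containing the query scripts; `submit_commands` and `run_all` are hypothetical helper names. By default it only prints the commands (a dry run) instead of executing them.

```python
import os
import subprocess

# Location taken from the commands above; adjust if Spark lives elsewhere.
SPARK_SUBMIT = os.path.expanduser("~/spark/bin/spark-submit")
SCRIPTS = ["part1q1q2.py", "part1q3.py", "part2.py"]


def submit_commands(spark_submit=SPARK_SUBMIT, scripts=SCRIPTS):
    # One spark-submit invocation per query script, in README order.
    return [[spark_submit, script] for script in scripts]


def run_all(dry_run=True):
    for cmd in submit_commands():
        if dry_run:
            print(" ".join(cmd))
        else:
            # check=True stops at the first failing submission.
            subprocess.run(cmd, check=True)


run_all(dry_run=True)
```

Calling `run_all(dry_run=False)` would actually submit the jobs, so Hadoop and Spark need to be running first.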

About

WPI CS4433 Project 3
