Python: Read Write CSV

In this article, I will show how to read data from a CSV file and write data to a new CSV file using the Python programming language.

Here’s a scenario: Suppose I have a CSV file with 3 columns (user_id, item_id, and star_rating). It has data for 100 users. Each user has rated 20 different items, so altogether there are 100 * 20 = 2000 entries in the CSV file.
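A quick way to confirm those numbers before processing a file like this is to count the rows per user. The following is only a minimal sketch: the helper name count_rows_per_user is my own, and the data is a tiny made-up sample in the same three-column layout, not the real dataset.

```python
import csv
import io
from collections import Counter


def count_rows_per_user(csv_file):
    """Return a Counter mapping user_id -> number of rows for that user."""
    reader = csv.DictReader(csv_file)  # the header row supplies the column names
    return Counter(int(row['user_id']) for row in reader)


# Made-up sample in the same layout (hypothetical values, not the real file)
sample = io.StringIO(
    "user_id,item_id,star_rating\n"
    "1,31,3\n"
    "1,28,3\n"
    "2,5,4\n"
)
counts = count_rows_per_user(sample)
print(len(counts), sum(counts.values()))  # distinct users, total rows -> 2 3
```

Passing the real file instead, opened with open('dataset-recsys.csv', newline=''), should give 100 distinct users and 2000 rows in total if the file matches the description above.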

Download: Input CSV File (CSV file which we will be working on)

Here’s what the CSV file looks like:

user_id  item_id  star_rating
1        31       3
1        28       3
1        20       4
1        34       1

Now, I have a requirement to select a single entry for each user, i.e. I have to select a total of 100 entries, one for each of the 100 different users.

For this, I will first read the CSV file and create a Python dictionary holding a single entry for each user. Then, I will use that dictionary to write the values to a new CSV file.

Here’s the full source code:

I am reading data from a CSV file named dataset-recsys.csv and writing to a new CSV file named dataset-recsys-new.csv.


# Written for Python 3
import csv
from pprint import pprint

dataset = {}  # maps user_id -> the first row seen for that user
with open('dataset-recsys.csv', newline='') as myfile:  # reading data from csv file
    # DictReader consumes the header row itself, so no manual skip is needed
    # (skipping one more line would drop the first data row)
    reader = csv.DictReader(myfile, delimiter=',')
    for line in reader:
        user_id = int(line['user_id'])
        if user_id in dataset:  # keep only one row for each user_id
            continue
        dataset[user_id] = {'user_id': line['user_id'], 'item_id': line['item_id'], 'star_rating': line['star_rating']}

print('Reading Successful')
# pprint(dataset)  # if you like to print the dictionary

fieldnames = ['user_id', 'item_id', 'star_rating']
# 'w' overwrites the file on each run; 'a' would append a second header and duplicate rows
with open('dataset-recsys-new.csv', 'w', newline='') as myfile:  # writing data to new csv file
    writer = csv.DictWriter(myfile, delimiter=',', fieldnames=fieldnames)
    writer.writeheader()
    for row in dataset.values():
        writer.writerow(row)

print('Writing Successful')

Download: Output CSV File (CSV file that is the output from above code)
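If the file is too large to hold in a dictionary, the same "one row per user" selection can be done while streaming. This is a sketch of an alternative, not the approach used above: the function name keep_first_row_per_user is hypothetical, and the demo uses made-up in-memory data so that it runs without any files.

```python
import csv
import io


def keep_first_row_per_user(infile, outfile):
    """Copy only the first row seen for each user_id from infile to outfile."""
    reader = csv.DictReader(infile)
    # Reuse the input header so the output has the same columns
    writer = csv.DictWriter(outfile, fieldnames=reader.fieldnames)
    writer.writeheader()
    seen = set()  # user_ids that have already been written
    for row in reader:
        if row['user_id'] in seen:
            continue
        seen.add(row['user_id'])
        writer.writerow(row)
    return len(seen)  # number of distinct users written


# Made-up sample data (hypothetical values, not the real dataset)
source = io.StringIO(
    "user_id,item_id,star_rating\n"
    "1,31,3\n"
    "1,28,3\n"
    "2,5,4\n"
    "2,7,1\n"
)
target = io.StringIO()
kept = keep_first_row_per_user(source, target)
result_lines = target.getvalue().splitlines()
print(kept)          # 2
print(result_lines)  # ['user_id,item_id,star_rating', '1,31,3', '2,5,4']
```

With real files, the two StringIO objects would be replaced by files opened with newline='', one for reading and one in 'w' mode for writing. Only the set of user_ids is kept in memory, instead of a full row per user.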

Hope this helps.
Thanks.