• Home
  • Blog
  • Operating system threads and a thread-safe hashtable in c programming

Operating system threads and a thread-safe hashtable in c programming

0 comments

a classic problem in text processing is identifying the unique words. it doesn’t seem to hard: open a file, read it a word at a time, stick it in a hashtable, dump the hashtable when done. it gets a bit harder when we want to do it fast!

for this program we are going to use threads and a thread-safe hashtable. your program will take a list of files. it will create a thread that will process each file in parallel. (the in parallel part is important; we want to do this fast!) as each thread reads a word, it will look up the word in the hashtable and add it if it is not there. since threads will be accessing the table concurrently, we need to make it threadsafe. 29.4 of the book talks about concurrent hashtables. please use 256 buckets and make sure you do per bucket locking. (do not lock the whole hashtable.) you can use a tree or linked list to keep track of elements in a given bucket.

use fscanf to read the words of a file. “%ms” is a GNU extension to fscanf that will take care of allocating space for words that are read. for example,

char *ptr;
fscanf(fh, "%ms", &ptr);

will read in a word and allocate enough space to hold the word and assign ptr to the allocated space. you will need to make sure that ptr is free()’ed when you are done using it.

program specification

your program (uc.c) will take a list of filenames from the program arguments. you will output a single number: the total number of unique words (just a single number). the only other thing that your program will print out is a error (using perror) for each file that could not be opened.

(there is no limit to the number of filenames. you will certainly be tested with more than 5!)

example execution

$ ./uc 1100wordsof599
599
$ ./uc 1100wordsof599 400wordsof300
795
$ ./uc 1100wordsof599 400wordsof300 800wordsof797
1481
$ ./uc 400wordsof300 800wordsof797
992
$ ./uc 400wordsof300 nofile 800wordsof797 doesnotexist
nofile: No such file or directory
doesnotexist: No such file or directory
992
$ ./uc
0

example files

800wordsof797 400wordsof300 1100wordsof599

grading criteria

program compiles with no warnings (cc –std=c11 -Wall -pthread) 5
a thread is started for each file to be processed 10
threads run in parallel 10
threads access a shared hashtable that is thread-safe 10
hashtable has per-bucket locking 10

only the total the count of the total unique words found and any errors opening files are printed

10

no leaks (valgrind –leak-check=full –show-leak-kinds=all)

15

output correct for example files and arguments

10

output correct for testers files and arguments

10

logic clear and readable

10

About the Author

Follow me


{"email":"Email address invalid","url":"Website address invalid","required":"Required field missing"}