Juniata College



Friday, February 28, 2020
Name _____________________________

Python data concepts. Short answer. [15 pts]
For each of these Python data structures, describe its basic structure, purpose, and inherent limitations, and state whether the structure is immutable.

Structure          Purpose          Limitations          Immutable?
Tuple ():
List []:
Dictionary {}:

Unix. Give the command to do the task as described. [15 pts]
_________ list the file names in your current working directory.
_________ list the files and hidden files with all their details.
_________ change your working directory up one level.
_____________________ display the contents of the file small.txt.
_____________________ display the contents of the file huge.txt so that you can look at it screen by screen.
    What key do you press to see the next screen? ______
    What key do you press to go back one screen? ______
    And how do you exit this command, especially if the file is huge? ______
________________ show how many lines and characters there are in the file weather.csv.
______________________ list all the lines in weather.csv that contain the word Fog.
______________________ list the last 15 lines of huge.txt.

For the transforming code below, fill in the skeletal code to read a csv file and convert all empty cells (‘’) or ‘unknown’ entries to ‘NA’. Remove the first value of each line as it’s just a row id number. There are not meant to be errors in the code. [15 pts]

    import csv
    def transform3(inFile, outFile):
        inf = open(inFile,"r", encoding="utf-8")
        outf = open(outFile,"_______", encoding="utf-8")
        csvinf = csv.reader(_________)
        csvout = csv.writer(_________)
        header = True
        for line in csvinf:
            if header:
                #fix header info
                headers= line
                headers.pop(_____)   #row id number deleted
                csvout.writerow( __________ )
                header = __________
            _______ :
                ________ . pop ( _____ )
                for i in range(len(______)) :
                    # we have to index the list to make changes
                    if str(line[i]) == ______ or str(line[i]) == _______________ :
                        line[i] = ________
                csvout.writerow(______)
        inf.close()
        outf.close()

Python has a powerful string formatting capability. Fill in the chart explaining the meaning of the various options available in this feature when using "{n: options}".format(a,b,c,d,…..). [10 pts]

Symbol             Meaning
{n}                The n refers to:
n.m                n=                    and m=
s                  s denotes formatting the item as a
d, b, o, x or X    These are options to format the item as a ______________ data type
f or e             These are options to format the item as a ______________ data type
<                  Does what to the data in the field? ___________________
>                  Does what to the data in the field? ___________________
^                  Does what to the data in the field? ___________________

Hadoop system concepts. [17 pts]
Give a typical file size that is worthwhile placing into a Hadoop system. ____________ [2]
A file is broken into blocks of what typical size in MB? ____________ [2]
Why are file blocks replicated in HDFS? Give two reasons. [4]
-
-
Give a brief explanation of mapping. [3]
Give a brief explanation of reducing. [3]
Between the mapping and reducing steps, there is one other operation. What is it doing? [3]

XML and JSON file types. [9 pts]
How are these file types similar? That is, what do they provide beyond a CSV? [3]
What does self-describing mean in these file types? [3]
How is XML more “wordy” than JSON? [3]

SQL Select. Give a SQL query to generate the result set for each description. [19 pts]
Assume the schema of relations as we’ve used in class.
Below is a reminder of the SQL syntax.

Student (StuId, LastName, FirstName, major, credits)
Faculty (FacId, Name, Dept, Rank)
Class (ClassNum, FacId, Schedule, Room)
Enroll (ClassNum, StuId, Grade)
Department (DeptId, Name)

Quick syntax for SQL, where [] means optional and {op1|op2|...} means choice:

    SELECT [DISTINCT] {* | attribute-list | aggregate functions}...
    FROM table {, table | NATURAL JOIN table | LEFT OUTER JOIN table {USING(attr) | ON condition}}*
    WHERE condition
    [GROUP BY attribute-list [HAVING condition]]
    [ORDER BY attribute]

SQL conditions consist of <, >, <=, >=, <>, =, IS [NOT] NULL, AND, OR, BETWEEN value AND value, IN (SELECT…)
Aggregate functions: COUNT(* | [DISTINCT] attr), MIN(attr), MAX(attr), SUM(attr), AVG(attr)

Generate a table to list the CSC faculty (all attributes) sorted by name. [4]
Assume each course is 3 credits. Get a table of the number of courses each student, by name, has earned credits from. Give the computed column header a name. [5]
Get a table of the Math majors' names and the courses they have taken. [5]
Get a table of all faculty, their names and the number of courses they each teach (need to outer join and group by). [5]