Exam 2
Fall 2015
Paul Krzyzanowski
- 6 points
(a) State one advantage of early binding over late binding.(b) State one advantage of late binding over early binding. - 5 points
How does virtual synchrony handle a fail-recover situation? That is, what happens when a failed node come back? - 6 points
Eric Brewer discusses consistency, availability, and partition tolerance in the CAP Theorem. Explain how it applies to traditional ACID semantics and the two-phase commit protocol. - 8 points
You have a collection of household information with each record containing information that includes:
{ name, age, address, zipcode, salary }
Use MapReduce to produce a list of zip codes where the average salaries are in the ranges: (1) under $100K, (2) $100K-500K, and (3) over $500K.
Explain the MapReduce operations needed to accomplish this. Use C/Java style pseudo code. Feel free to use
<=
and>=
comparisons for dates and assume that basic math operations and functions such as average() are available. Assume that the map function is called once for each parsed record of data and a function called emit() exists in the form emit(key, value). For reduce, assume that a print function exists that prints results; for example,print(name, result)
. You will need to use MapReduce twice, using the output of one run as the input of the second.Map_1 function map1(name, age, address, zip, salary) {
} Reduce_1 function reduce1(key, list) {
}
Map_2 function map1( ) {
} Reduce_1 function reduce1(key, list) {
}
Part II – 75 points – 3 points each
For each statement, select the most appropriate answer.
- Metcalfe's Law, named after the inventor of Ethernet, tells us that the value of a network increases as:
(a) The speed of the network increases.
(b) The cost of networking hardware decreases.
(c) The amount of data sent per person increases.
(d) The number of connected users increases. - Which of the following is an example of a pure name?
An email address (e.g., pxk@cs.rutgers.edu).
An Ethernet address (e.g., 00:1d:72:ae:cd:c6).
A file pathname (e.g., /usr/src/linux-headers/arch/arm/include/asm/syscall.h).
A URL (e.g., http://www.cs.rutgers.edu/~pxk/417/exam/index.html). - The DNS (Domain Name System) server for the EDU domain knows:
The list of names and owners of domains under .edu but not their IP addresses or name servers.
The IP addresses for all domains under .edu, including subdomains (e.g., cs.princeton.edu).
The IP addresses for all domains directly under .edu and name servers for subdomains.
The domain name servers for each domain directly under .edu. - The two-army problem demonstrates this about communication over faulty lines:
Reliable communication can never be achieved.
Reliable communication can be achieved but requires having the receiver send an acknowledgement.
Reliable communication can be achieved but may require several back-and-forth acknowledgements.
Reliable communication can be achieved only in one direction. - In virtual synchrony, a view change is defined as the event that takes place when:
The system hosting the group membership service changes.
A different process starts sending messages.
A message has been received by all group members.
The group membership changes. - In virtual synchrony, a message is considered stable:
When the sender promises not send any updates for that message.
Once it has been delivered to the application.
Once an incoming message has been fully received and validated to be error-free.
If it has been received by all group members. - In a Paxos system with P proposers and A acceptors, how many proposers need to be alive for the algorithm to work?
1.
⌈(P+1)/2⌉ (a majority of proposers).
⌈(A+1)/2⌉ (one proposer for a majority of acceptors)
A (one proposer for each acceptor). - In a Paxos system with P proposers and A acceptors, how many acceptors need to be alive for the algorithm to work?
1.
⌈(P+1)/2⌉ (one acceptor for a majority of proposers).
⌈(A+1)/2⌉ (a majority of acceptors).
P (one acceptor for each proposer). - Which is NOT a property of a transaction?
All-or-nothing: a transaction cannot complete partially.
Available: a transaction cannot keep other transactions from running.
Serializable: the result of concurrent transactions must be the same as if they ran in some serial order.
Permanent: once completed, the results of a transaction are made permanent. - The first thing that happens in the second phase of the two-phase commit protocol is (assume it will commit):
The participant responds to the commit query message received in phase 1.
The coordinator sends a commit message to all participants.
The coordinator tells all participants to free resources held by the transaction.
The coordinator asks the participants if they have completed the transaction, which started in phase 1. - The purpose of an additional phase in the three-phase commit protocol is to:
Get unanimous agreement from all participants on whether to commit or abort.
Tell all participants the outcome of the commit vote before telling them to commit or abort.
Have the coordinator receive acknowledgement of whether a commit or abort was successful.
Log the status of the commit or abort decision to a write-ahead log. - Differing from two-phase locking, in strict two-phase locking:
A transaction must promise to release every lock that it is granted.
A transaction must grab a lock for every resource that it needs to access.
A transaction must release all of its locks only at the end.
A transaction cannot acquire locks after it has released a lock. - The reason that strict two-phase locking was introduced as an enhancement of two-phase locking was to:
Increase the potential amount of concurrency possible.
Ensure that transaction executions are serialized.
Avoid cascading aborts when transactions read uncommitted data.
Achieve consistency by forcing transactions to get locks for any resources they need. - The edge chasing used in the Chandy-Misra-Haas algorithm is used to:
Prevent deadlock by ensuring that a transaction can only wait on resources held by an older transaction.
Construct a global wait-for graph to determine whether deadlock exists.
Detect whether there will be a circular dependency on waiting for resources.
Ensure deadlock cannot occur by aborting the transaction that is currently using the needed resource. - Deadlock will never occur if:
A resource can be held by at most one process.
Processes that hold resources are not allowed to wait for another resource.
A resource, once granted, cannot be taken away.
Two or more processes are waiting for resources held by one of the other processes. - A callback in AFS:
Enables a server to control bandwidth by telling a client when it can issue a request.
Informs a client when its requested operation has completed.
Informs a client that a requested file lock is available.
Informs a client that a cached file is now invalid. - Which file system is not implemented at the operating system level?
Network File System (NFS).
Andrew File System (AFS).
Microsoft SMB.
Google File System (GFS). - Coda?s client modification log enables:
Reintegration.
Read-write replication.
Consistent caching.
Disconnected operation. - Credit-based flow control in SMB:
Allows the server to control the load from each client.
Provides a billing system for clients that sue a server.
Improves the level of concurrency in file system operations.
Enables the client to cache more data. - Which file system was designed to support concurrent appends to a file?
NFS.
AFS.
SMB.
Coda. - An SMB oplock gives clients:
The possibility of caching file attributes.
The ability to lock a remote file.
The ability to check out a file for disconnected use.
The ability to synchronize local copies of remote files. - Which file systems place file metadata on separate servers from file data?
GFS and Chubby.
Dropbox and GFS.
Chubby, Dropbox, and GFS.
AFS and GFS. - Why did Dropbox add notification servers to their architecture?
To provide a mechanism to alert administrators when problems arise.
Dropbox servers needed to be notified when the file has been uploaded to Amazon servers.
To ensure that files are consistent among multiple clients.
To reduce load from clients polling the servers. - Chubby does not support:
Partial file reads.
Event notifications.
Whole file uploads.
File locking. - AS GFS master:
Identifies the addresses of all the name servers that keep track of file names and data.
Accepts every client request and routes it to the appropriate server.
Stores all the names in the file system along with the location of their data
Receives file write operations from clients that are then propagated to replicas. - How does Bigtable manage the growth of a table?
Individual table cells are distributed among a large set of servers.
An entire column of a table can be migrated to a different server.
Each new row is allocated to one server in a large set of servers based on a hash of its key.
A table is split along rows into sub-tables.