CS417 Exam 3
Fall 2018
Paul Krzyzanowski
99 points - 3 points each. For each statement, select the most appropriate answer.
- In the Google cluster architecture, a search query is sent to a specific Google data center via:
(a) Hardware load balancers.
(b) Routing rules defined at each ISP.
(c) DNS-based load balancing.
(d) The address that is present in the URL.
- A principle not employed in the Google search cluster is:
(a) Merging is efficient: divide the search among multiple systems.
(b) Use commodity PCs instead of highly-reliable computers.
(c) The price/performance ratio of a computer is more important than its peak performance.
(d) Design computers to minimize downtime by using fault-tolerant hardware.
- If many map workers in MapReduce generate huge amounts of data that all share the same key:
(a) The (key, value) sets can be processed in parallel by multiple reduce workers.
(b) All (key, value) sets must be processed by a single reduce worker.
(c) It will result in an error since each map worker must generate data with unique keys.
(d) Multiple iterations of MapReduce may be run to process the data.
- In MapReduce, a user's reduce function is called once for each:
(a) Map worker.
(b) Unique (key, value) pair.
(c) Reduce worker.
(d) Unique key.
- The partitioning function in MapReduce:
(a) Determines which chunks of input data will be processed by which map workers.
(b) Determines which keys will be handled by which reduce workers.
(c) Dynamically load balances the ratio of map and reduce workers.
(d) Breaks up the available servers among the concurrent users running MapReduce jobs.
- MapReduce's shuffle phase:
(a) Gets the needed data to each reduce worker.
(b) Randomly distributes key processing among reduce workers to ensure a balanced load.
(c) Breaks up large sets of data that share the same key among multiple reduce workers.
(d) Allocates any available additional map servers to speed up processing.
- When a reduce worker dies in MapReduce, another one takes its place and:
(a) All the reduce workers have to restart.
(b) All the map workers have to restart to generate data for the reduce phase.
(c) Reads the same input data as the old one, with no change to any other workers.
(d) The MapReduce job restarts from the last checkpoint.
- When can a reduce worker start executing its first reduce function?
(a) When every single map worker has completed its work.
(b) As soon as at least one map worker has completed all of its work.
(c) When at least one map worker generates its first (key, value) pair.
(d) It can start concurrently with the map workers.
- As a Bigtable table gets more columns added to it:
(a) Columns are split into column families and may all be distributed among multiple servers.
(b) All columns within a tablet are still managed by one server.
(c) The operation will fail since all columns must be set up at the start.
(d) New columns may be placed on other servers while the old ones remain in place.
- Bigtable's memtable is:
(a) A cache of frequently accessed table rows stored at the client.
(b) The set of recently committed changes to Bigtable stored on a tablet server.
(c) A cache of frequently accessed table rows stored on the Bigtable master.
(d) Queued client writes to Bigtable that didn't yet get sent to a tablet server.
- Metadata tablets:
(a) Allow clients to find the tablet server responsible for a key.
(b) Store access control permissions for rows and columns of the table.
(c) Track disk and memory usage of the table.
(d) Balance the distribution of tablets among servers.
- Bigtable does not allow a client to:
(a) Access old versions of data in a table cell.
(b) Iterate over all keys in a range.
(c) Add columns to a table.
(d) Perform atomic changes on multiple rows.
- Spanner appears to violate Brewer's CAP theorem. In reality, it:
(a) Provides a CA model (consistency + availability) since partitions cannot occur.
(b) Provides an AP model (availability + partition tolerance) but sometimes gives up on consistency.
(c) Provides a CP model (consistency + partition tolerance) but sometimes loses availability.
(d) Uses the TrueTime API to provide all three: consistency, availability, and partition tolerance.
- Spanner's TrueTime provides:
(a) A globally unique timestamp that reflects the real time of day.
(b) A way to synchronize clocks across multiple data centers.
(c) A range of time that includes the current time of day.
(d) A way to convert between time zones at different data centers so remote commit timestamps make sense.
- Commit wait in Spanner is used to:
(a) Delay a commit until other concurrent transactions have completed.
(b) Delay the start of a transaction until other transactions that access the same data have committed.
(c) Wait until all resources are released before committing.
(d) Have the commit order of transactions match time of day order.
- Spanner can provide lock-free reads by:
(a) Allowing clients to read data up to a specific time in the past.
(b) Employing optimistic concurrency control techniques.
(c) Using commit wait operations in a transaction to give the illusion of concurrency.
(d) Using strong strict two-phase locking (SS2PL).
- A superstep in BSP (bulk synchronous parallel) is:
(a) A sequence of computation steps that terminate in writing a checkpoint file.
(b) A set of concurrent operations that terminate in a barrier.
(c) The set of operations performed on behalf of one vertex.
(d) The period of time between computations during which messages are sent and received.
- When a worker dies in Pregel, another worker takes over and:
(a) Computation restarts on the vertices that the dead worker was responsible for since the last superstep.
(b) All workers except those that already voted to halt must restart their computation from the last checkpoint.
(c) All workers must restart their computation from the last checkpoint.
(d) Computation restarts on the vertices that the dead worker was responsible for since the last checkpoint.
- Combiners in Pregel are used to:
(a) Combine multiple vertices into one vertex.
(b) Combine multiple edges into one edge.
(c) Form a disjoint union of two or more graphs, creating one graph.
(d) Optimize message delivery by combining multiple messages sent to a vertex.
- A Pregel job terminates when:
(a) Each vertex votes to halt and no vertex receives a message.
(b) All vertices vote to halt.
(c) No vertices sent any messages to any other vertices.
(d) At least one vertex requests to halt.
- A difference between Spark transformations and actions is that a transformation:
(a) Modifies an existing RDD while an action creates a new one.
(b) Only rearranges an RDD while an action performs computation on the data.
(c) Produces a new RDD while an action reads but does not generate an RDD.
(d) Is used only on original data while actions can be pipelined together on those results.
- Multihoming refers to:
(a) Connecting a computer to multiple networks.
(b) Replicating content across multiple computers.
(c) Sharding content across multiple computers, so each has only a part of the content.
(d) Caching frequently-used content at caching servers or proxies.
- An Akamai CDN load balances incoming requests by using:
(a) A hardware load balancer at each data center.
(b) Its own DNS (Domain Name System) servers that may return different addresses for the same domain name.
(c) HTTP redirection to allow a server to redirect a request to a more suitable server.
(d) Multiple computers configured to share the same IP address.
- Quorum in a cluster means:
(a) Having a collection of servers vote on which node runs which services.
(b) Finding out if there are enough live nodes to continue running the cluster.
(c) Propagating the state of the cluster configuration to all nodes in the system.
(d) Determining how failover of a service takes place if the node running it dies.
- A cluster file system differs from a network file system in that:
(a) It is designed to be fault tolerant by replicating content across multiple servers.
(b) It is designed to be dynamically scalable by adding more storage servers.
(c) Systems access storage at a block level instead of a file system level.
(d) It allows multiple systems to access files concurrently.
- The technique of fencing in a cluster refers to:
(a) Disabling or isolating a faulty component in the cluster.
(b) Establishing a perimeter of protection so outside systems cannot communicate with members of the cluster.
(c) Separating storage nodes from computing nodes.
(d) Tracking which nodes in the cluster are alive.
- For Bob to send Alice a message that only Alice can read, he would encrypt it with:
(a) Bob's private key.
(b) Bob's public key.
(c) Alice's private key.
(d) Alice's public key.
- If it takes one day to try all combinations of a 60-bit symmetric key, how long would it take to crack a 65-bit key?
(a) 5 days.
(b) Approximately one month.
(c) Approximately one year.
(d) Approximately 32 years.
- The Diffie-Hellman algorithm is designed to:
(a) Use a trusted third party that will distribute a session key securely to both parties.
(b) Use public key cryptography to communicate among two parties.
(c) Create a session key that can be shared with any number of participants.
(d) Allow two parties to come up with a shared key that nobody else can derive.
- A digital signature differs from a message authentication code because:
(a) It is encrypted.
(b) It does not rely on hash functions or checksums.
(c) It requires the use of public key cryptography.
(d) It identifies the user who generated the signature.
- The Challenge Handshake Authentication Protocol has this advantage over the Password Authentication Protocol:
(a) An encrypted communication channel is used to disallow sniffing the network.
(b) The password is never sent across the network.
(c) It allows the client to authenticate the server.
(d) It results in sending fewer messages over the network.
- A digital certificate stores:
(a) Message authentication codes for a web site.
(b) Digital signatures of web pages for a site.
(c) A user's public key.
(d) A user's private key.
- SSL uses a hybrid cryptosystem, which means:
(a) Public key cryptography is used to pass a symmetric session key.
(b) A different key is used to encrypt data flowing in each direction.
(c) Multiple layers of encryption are used to provide increased security.
(d) Every message has a Message Authentication Code attached to it.