Test your knowledge of load balancing strategies and scaling techniques in cloud architectures. This quiz covers key concepts, best practices, and real-world scenarios to help you understand efficient distribution of workloads and elastic scaling in cloud environments.
Which type of load balancing distributes client requests based on current server workload, such as CPU usage or memory utilization?
Explanation: Least connections load balancing directs incoming requests to the server with the fewest active connections, which often correlates with lighter resource usage. Round robin simply rotates requests in order and may ignore server load. Random selection assigns requests unpredictably, without considering workload. Hash-based approaches use hash values to direct traffic based on client or request attributes, not server load.
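The least-connections idea above can be sketched in a few lines: pick the backend with the fewest active connections as a proxy for the lightest load. This is a minimal illustrative sketch; the server-dictionary shape and names are assumptions, not any specific load balancer's API.

```python
def pick_least_connections(servers):
    """Return the server with the fewest active connections."""
    return min(servers, key=lambda s: s["active_connections"])

# Hypothetical backend pool with current connection counts.
pool = [
    {"name": "web-1", "active_connections": 12},
    {"name": "web-2", "active_connections": 3},
    {"name": "web-3", "active_connections": 7},
]

print(pick_least_connections(pool)["name"])  # web-2
```

Contrast this with round robin, which would cycle web-1, web-2, web-3 in order regardless of how busy each one is.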
What is the primary benefit of horizontal scaling for web applications experiencing unpredictable traffic spikes?
Explanation: Horizontal scaling allows you to handle unpredictable or high traffic by adding more servers, providing better fault tolerance and increased capacity. Increasing server memory or improving CPU speed refers to vertical scaling, not horizontal. While adding servers can indirectly reduce latency by distributing the load, its main purpose is to increase overall capacity.
When a user must always connect to the same backend server during a session, which load balancing feature is essential?
Explanation: Session persistence ensures that a user's requests during a session are routed to the same server, which is critical when session information is stored locally. SSL termination deals with decrypting secure traffic, not routing. Caching improves response time but doesn't manage connections per user. DNS round robin distributes requests based on DNS lookup, which may break session continuity.
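One common way to implement session persistence is to hash a stable session identifier to a fixed backend, so every request in that session lands on the same server. This is a simplified sketch under assumed names; real balancers often use cookies or consistent hashing instead.

```python
import hashlib

def sticky_server(session_id, servers):
    """Map a session ID deterministically to one backend in the pool."""
    digest = hashlib.sha256(session_id.encode()).hexdigest()
    return servers[int(digest, 16) % len(servers)]

servers = ["app-1", "app-2", "app-3"]

# The same session always hashes to the same backend.
assert sticky_server("user-42", servers) == sticky_server("user-42", servers)
```

A caveat worth noting: plain modulo hashing reshuffles most sessions whenever the pool size changes, which is why production systems favor cookie-based stickiness or consistent hashing.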
Which metric is most commonly monitored to automatically scale out compute resources in a cloud environment?
Explanation: CPU utilization is widely used as a key threshold for triggering autoscaling actions, as high CPU usage often indicates increased demand. Disk space is less directly related to immediate traffic spikes. Network traffic may be used but is generally less common than CPU. Server hostname is not a metric for scaling decisions.
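A CPU-threshold autoscaling rule like the one described can be sketched as a simple decision function. The threshold values and action names here are illustrative assumptions, not any cloud provider's defaults.

```python
def scaling_decision(cpu_percent, scale_out_at=70.0, scale_in_at=30.0):
    """Return a hypothetical autoscaling action for an average CPU reading."""
    if cpu_percent > scale_out_at:
        return "scale_out"   # demand is high: add instances
    if cpu_percent < scale_in_at:
        return "scale_in"    # demand is low: remove instances
    return "no_change"       # within the comfortable band

print(scaling_decision(85.0))  # scale_out
print(scaling_decision(50.0))  # no_change
```

Keeping a gap between the scale-out and scale-in thresholds avoids "flapping," where the pool repeatedly grows and shrinks around a single cutoff.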
Why are periodic health checks important in load balancing for cloud architectures?
Explanation: Periodic health checks help identify servers that are unresponsive, so the load balancer can stop sending requests to them. They do not encrypt data; that is handled by protocols such as TLS. DNS records are managed by DNS services, not health checks. Splitting databases is a database sharding procedure, not related to load balancer health checks.
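A health check like the one described typically probes an HTTP endpoint and removes any backend that fails from the routing pool. This is a minimal sketch assuming each backend exposes a `/health` URL; the filtering step takes the probe as a parameter so it can be exercised without a live network.

```python
import urllib.request

def is_healthy(url, timeout=2.0):
    """Probe a health endpoint; any error or non-200 counts as unhealthy."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status == 200
    except OSError:
        return False

def healthy_backends(backends, check):
    """Keep only backends whose health probe succeeds."""
    return [b for b in backends if check(b)]

# Example wiring (would hit the network in a real deployment):
# live = healthy_backends(pool, lambda b: is_healthy(b + "/health"))
```

The load balancer would run this on a fixed interval and route traffic only to the surviving list.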
A database server upgrades from 16 GB to 64 GB of RAM to handle more queries. What type of scaling does this illustrate?
Explanation: Vertical scaling increases the capabilities of a single server, such as adding more RAM or faster CPUs. Cloud bursting temporarily uses outside resources but isn't about upgrading a single server. Horizontal scaling adds more servers instead of strengthening one. Auto failover involves switching to backup components, not upgrading resources.
At which layer of a load balancer does content-based routing, such as directing image requests to specialized servers, primarily occur?
Explanation: Layer 7 load balancing operates at the application layer, enabling routing based on content, such as URLs or headers. Layer 4 works with transport data, not application content. Layer 2 is data link, handling frames, while Layer 3 is the network layer, responsible for IP routing but not application content.
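Layer 7 routing of this kind inspects application-level data, here the request path, and chooses a backend pool accordingly. The pool names and path rules below are illustrative assumptions, not a real configuration.

```python
def route_by_path(path):
    """Pick a backend pool from the request path (application-layer data)."""
    if path.startswith("/images/") or path.endswith((".png", ".jpg")):
        return "image-servers"   # specialized pool for static images
    if path.startswith("/api/"):
        return "api-servers"
    return "web-servers"

print(route_by_path("/images/logo.png"))  # image-servers
print(route_by_path("/api/users"))        # api-servers
```

A Layer 4 balancer could not make this choice, because it sees only IP addresses and ports, never the URL.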
What is the main advantage of global load balancing for users accessing an application from different continents?
Explanation: Global load balancing directs user requests to the geographically closest regional server, which can lower latency and enhance performance for users in different regions. Lowering internal memory usage is unrelated. Faster local failover refers to handling failures within one site. Reducing instance count would hurt availability rather than help.
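The "nearest region" decision can be sketched as picking the region with the lowest measured latency to the user. The region names and latency figures below are invented for illustration; real global balancers typically combine geolocation, anycast, or active latency probes.

```python
# Hypothetical latency measurements from one user's vantage point, in ms.
REGION_LATENCY_MS = {
    "us-east": 120,
    "eu-west": 35,
    "ap-south": 210,
}

def nearest_region(latencies):
    """Return the region with the lowest measured latency."""
    return min(latencies, key=latencies.get)

print(nearest_region(REGION_LATENCY_MS))  # eu-west
```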
Why are stateless applications better suited for horizontal scaling in cloud architectures?
Explanation: Stateless applications handle each request independently, making it easy to distribute traffic across multiple instances. They do not require more CPU resources by default, nor do they store all data on a single server, which would hinder scaling. Sticky IP or session persistence is less critical for stateless applications; it's usually needed for stateful ones.
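The statelessness property can be shown with a handler that computes its response purely from the request, reading and writing no server-side session state. This is a toy sketch; the request and response shapes are assumptions.

```python
def handle(request):
    """A stateless handler: output depends only on the request itself."""
    # No session store, no instance-local state is touched here.
    return {"status": 200, "body": "echo: " + request["body"]}

# Any two instances of this handler produce identical results for the
# same request, so a load balancer can send it to either one freely.
req = {"body": "hello"}
assert handle(req) == handle(req)
```

A stateful counterpart, say one that appended to an in-memory per-user history, would force the balancer to pin each user to one instance, which is exactly the session-persistence burden statelessness avoids.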
Before removing servers during a scale-down event, what step helps ensure that no in-progress user requests are lost?
Explanation: Session draining allows servers to finish processing existing requests before they are removed from the pool, preventing data loss or errors. Force shutdown terminates ongoing connections abruptly. Random decommissioning risks dropping active sessions. Static IP assignment pertains to network configuration, not safe resource reduction.
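Session (connection) draining follows the two steps described: stop routing new requests to the server, then wait until its in-flight requests finish before removing it. This sketch simulates request completion with a counter; the field names and timing values are assumptions, not a real balancer's API.

```python
import time

def drain(server, poll_interval=0.01, timeout=5.0):
    """Drain a server: refuse new work, wait for in-flight requests to end."""
    server["accepting"] = False                  # step 1: no new requests
    deadline = time.monotonic() + timeout
    while server["active_requests"] > 0 and time.monotonic() < deadline:
        time.sleep(poll_interval)                # step 2: wait for completions
        server["active_requests"] -= 1           # simulated request finishing
    return server["active_requests"] == 0        # True means safe to remove

srv = {"accepting": True, "active_requests": 3}
print(drain(srv))  # True
```

The timeout matters in practice: a stuck request should not hold up a scale-down event forever, so most systems force-close connections after a bounded drain period.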