Why is Go fast?

Go has become popular for microservices & for scaling. What are the design decisions that make Go fast?

Summary:

1. Clever usage of the stack in preference to the heap

2. Lightweight Goroutines within a process, avoiding OS calls to switch between threads & processes

3. A compiler that disallows circular dependencies, for quicker builds

Details:

1. The Stack

Sequential access has long been the fastest way to read memory. The stack is a contiguous block of memory with fast & simple allocation/deallocation: entering or leaving a function just moves the stack pointer to add or drop a stack frame. Heap memory, in contrast, is reached through pointers to chunks of memory that may be scattered. https://www.ardanlabs.com/blog/2017/05/language-mechanics-on-stacks-and-pointers.html

Heap memory is hard to manage. It has to be managed either by hand, as in C/C++, or by a garbage collector that tracks data that is no longer pointed to. Garbage collectors are designed either for high throughput (reclaiming as much unused memory as possible in a scan) or for the more popular low latency (finishing each scan quickly). Go uses a low latency garbage collector. https://research.google/pubs/pub40801/

When the compiler cannot prove that data is safe to keep on the stack, for example because a pointer to it outlives the function that created it, the data 'escapes' & is allocated on the heap instead. The compiler's process for deciding this is called 'escape analysis'. This can get complex: if data that needs to be on the heap were left on the stack, the result would be memory corruption, so the compiler errs toward the heap. The better the escape analysis, the more data can remain on the stack & the less needs to move to the heap, which can improve performance. https://segment.com/blog/allocation-efficiency-in-high-performance-go-services/
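As a minimal sketch of the difference, the two functions below build the same data but hand it back differently. The point type & both function names are hypothetical, made up for this illustration. Building with go build -gcflags=-m prints the compiler's escape decisions, so the behavior can be checked on your own Go version rather than taken on faith.

```go
package main

import "fmt"

// point is a small hypothetical struct used only for this illustration.
type point struct {
	x, y int
}

// newPointValue returns the struct by value. The caller receives a copy,
// so nothing forces the data to outlive this function's stack frame.
func newPointValue(x, y int) point {
	return point{x: x, y: y}
}

// newPointPointer returns the address of a local variable. The data must
// outlive this call, so the compiler typically reports that it escapes
// to the heap (unless inlining lets it prove otherwise).
func newPointPointer(x, y int) *point {
	p := point{x: x, y: y}
	return &p
}

func main() {
	v := newPointValue(1, 2)
	p := newPointPointer(3, 4)
	fmt.Println(v.x+v.y, p.x+p.y)
}
```

Both functions are correct Go; the only difference is where the data is likely to end up, & therefore how much work is left for the garbage collector.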

Even though RAM is "Random Access Memory", the fastest way to read memory is still sequential. Following pointers to random heap locations, even in RAM, can be up to two orders of magnitude slower than reading sequential data. https://www.forrestthewoods.com/blog/memory-bandwidth-napkin-math/
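The two access patterns can be sketched in Go as follows. The item type & the function names are hypothetical. The code only shows the difference in memory layout; it does not measure anything, & real numbers would need a benchmark on your own hardware.

```go
package main

import "fmt"

// item is a hypothetical record used only for this illustration.
type item struct {
	id    int
	value int
}

// sumValues walks a slice of structs. The elements sit next to each
// other in one backing array, so the loop reads memory sequentially.
func sumValues(items []item) int {
	total := 0
	for i := range items {
		total += items[i].value
	}
	return total
}

// sumPointers walks a slice of pointers. Each element is an address that
// must be followed to reach the struct, which may be anywhere on the heap.
func sumPointers(items []*item) int {
	total := 0
	for _, it := range items {
		total += it.value
	}
	return total
}

func main() {
	const n = 1000
	values := make([]item, n)
	pointers := make([]*item, n)
	for i := 0; i < n; i++ {
		values[i] = item{id: i, value: i}
		pointers[i] = &item{id: i, value: i}
	}
	fmt.Println(sumValues(values), sumPointers(pointers))
}
```

Both loops compute the same sum; the slice of values simply gives the CPU a contiguous run of memory to read, while the slice of pointers adds one indirection per element.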

In Java, objects are stored on the heap; only a reference to the object is on the stack. Lists in Java, though they look linear & sequential, are stored as an array of references; the actual data is elsewhere on the heap. Python, Ruby & JavaScript behave similarly. Java has clever, complex, tunable garbage collectors, with both high throughput & low latency GCs available in the JVM. The VMs for Python, Ruby & JavaScript are less optimized than Java's.

In Go, structs & primitive types are values that can live on the stack as long as they do not escape. Idiomatic Go favors passing values & discourages unnecessary pointers. The garbage collector is optimized for low latency to return quickly. The design decision of favoring & encouraging the stack results in better performance.

2. Concurrency: Processes vs Threads vs Goroutines

Concurrency has to be used for the right use-cases. Concurrency is not parallelism & can increase code complexity. Amdahl's Law provides a formula for the maximum speedup from parallelizing work, depending on how much of the work is inherently sequential. Concurrency is best used for slow work such as I/O or network calls rather than for most in-memory work. https://www.oreilly.com/library/view/the-art-of/9780596802424/
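Amdahl's Law is usually written as speedup = 1 / ((1 - p) + p/n), where p is the fraction of the work that can run in parallel & n is the number of workers. A small sketch in Go, with a hypothetical function name & made-up example fractions:

```go
package main

import "fmt"

// amdahlSpeedup returns the theoretical speedup from Amdahl's Law:
// 1 / ((1 - p) + p/n), where p is the fraction of the work that can run
// in parallel (0 to 1) and n is the number of workers.
func amdahlSpeedup(p float64, n int) float64 {
	return 1 / ((1 - p) + p/float64(n))
}

func main() {
	for _, p := range []float64{0.5, 0.9, 0.99} {
		fmt.Printf("parallel fraction %.2f: 8 workers give %.2fx\n", p, amdahlSpeedup(p, 8))
	}
}
```

When only half the work can run in parallel, 8 workers yield less than a 2x speedup, which is why the sequential portion matters more than the worker count.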

Traditionally, languages provide concurrency either by creating threads within a process through OS calls, with locking to share data, or by running multiple processes. The OS schedules the threads of a process onto CPU cores.

The Go runtime creates a small pool of OS threads within the Go process & reuses them, rather than paying the thread creation cost for every task, & runs its own scheduler on top of them. Go provides Goroutines, which can be thought of as lightweight threads managed by Go rather than the OS. Creating a Goroutine & switching between Goroutines is quick since both happen within the Go process & don't require an OS call.

Go's scheduler is part of the Go process, which makes it quick compared to the OS scheduler. It automatically balances the workload across the threads & works with the GC. The scheduler is optimized; for example, it can unschedule a Goroutine that is blocked on I/O so that another can run. https://youtu.be/YHRO5WQGh0k

Goroutine stacks start out smaller than OS thread stacks, consuming less memory (they can grow as needed, paying a performance penalty at that time). Consequently, Go programs can spawn tens of thousands of simultaneous Goroutines, while a similar approach with native OS threads in other languages will slow to a crawl.
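A minimal sketch of spawning tens of thousands of Goroutines is shown here. The function name & the count of 50000 are arbitrary choices for illustration. It uses sync.WaitGroup to wait for completion & sync/atomic for the shared counter.

```go
package main

import (
	"fmt"
	"sync"
	"sync/atomic"
)

// runGoroutines starts n goroutines, each of which increments a shared
// counter once, and returns the counter after all of them have finished.
func runGoroutines(n int) int64 {
	var counter int64
	var wg sync.WaitGroup
	wg.Add(n)
	for i := 0; i < n; i++ {
		go func() {
			defer wg.Done()
			atomic.AddInt64(&counter, 1)
		}()
	}
	wg.Wait()
	return atomic.LoadInt64(&counter)
}

func main() {
	fmt.Println(runGoroutines(50000))
}
```

Each Goroutine here does trivial work, so the sketch mostly exercises creation & scheduling, which is the part Go keeps cheap.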

Go uses the CSP (Communicating Sequential Processes) concurrency model by default, using unbuffered & buffered channels to communicate data. This pattern enhances code clarity but is not a performance improvement; the performance improvement comes from the earlier design decisions. Mutexes & atomics are available, if needed.
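A short sketch of both channel kinds follows, with hypothetical function names. One Goroutine produces values on an unbuffered channel while the caller consumes them; a buffered channel is shown separately in main.

```go
package main

import "fmt"

// squares sends the square of each input on an unbuffered channel and
// closes the channel when done, so the receiver's range loop ends.
func squares(nums []int) <-chan int {
	out := make(chan int) // unbuffered: each send waits for a receiver
	go func() {
		defer close(out)
		for _, n := range nums {
			out <- n * n
		}
	}()
	return out
}

// sumSquares receives from the channel until it is closed.
func sumSquares(nums []int) int {
	total := 0
	for sq := range squares(nums) {
		total += sq
	}
	return total
}

func main() {
	fmt.Println(sumSquares([]int{1, 2, 3, 4})) // prints 30

	// A buffered channel accepts sends without a receiver until it is full.
	buffered := make(chan string, 2)
	buffered <- "a"
	buffered <- "b"
	fmt.Println(<-buffered, <-buffered) // prints a b
}
```

No locks appear in the code: the data is handed from one Goroutine to the other through the channel instead of being shared.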


3. Compiler

Compiling large projects can be time consuming. The Go compiler does not allow circular dependencies between packages. This adds a development burden to organize the packages but results in quicker builds, compared to competing compilers that support circular dependencies.


Conclusion:

Some simple design decisions have made Go a performant language, resulting in high adoption & usage for scalable applications, despite its relatively young age.

Reference: The excellent book "Learning Go: An Idiomatic Approach to Real-World Go Programming" by Jon Bodner
