Diving into WebSockets with Go: A Journey of Learning and Rabbit Holes
Introduction
I wanted to learn a new language to add to my toolbox and build my street cred; something that big boy developers use 👀 (it's not Rust or C, but it's a step in the right direction...). This would be my first compiled language, and it had the promise of:
- speed ✅
- productivity ✅
- concurrency and parallelism as core parts of the language ✅
- a cute mascot ✅
I took a course, learned the syntax, and wrote a few toy programs along the way, but I knew that unless I built something "real", I wouldn't retain much.
Then it occurred to me that I also don't know anything about WebSockets. Of course, I had used them extensively before, but mostly ascribed their inner workings to magic 🪄.
They power real-time applications, but I had never looked under the hood to understand how they actually work. What better way to learn than building a WebSocket server as a first project in a language I barely know!
What followed was a roller-coaster of operating on bits, shoddy error handling, and AI-assisted learning.
Learning with AI: No Dumb Questions
One of the invaluable parts of this journey was having AI as my teacher. I could ask it questions without feeling dumb, get instant explanations, and have concepts broken down in different ways until they finally clicked.
Here are a couple of questions I asked, besides the tsunami of questions I had about the WebSocket protocol itself:
What is big-endian?
It's a way of ordering bytes when storing or transmitting data. Big-endian means the most significant byte comes first; little-endian means the least significant byte comes first. For example:

| Big-endian | Little-endian |
|---|---|
| 00010010 00110100 01010110 01111000 | 01111000 01010110 00110100 00010010 |

They both represent the same 32-bit number, but the order of the bytes is different.
In practice, it meant I had to rethink how I read multibyte values when parsing WebSocket frames.
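To make that concrete, here's a minimal sketch (not code from the project) using Go's encoding/binary package to interpret the same two bytes both ways:
package main

import (
	"encoding/binary"
	"fmt"
)

func main() {
	b := []byte{0x12, 0x34}
	// big-endian: 0x12 is the most significant byte -> 0x1234 = 4660
	fmt.Println(binary.BigEndian.Uint16(b))
	// little-endian: 0x34 is the most significant byte -> 0x3412 = 13330
	fmt.Println(binary.LittleEndian.Uint16(b))
}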
What is the XOR binary operation?
If you have two bits, the XOR operation will return 1 if the bits are different and 0 if they are the same.

| bit1 | bit2 | XOR-ed bit |
|---|---|---|
| 1 | 1 | 0 |
| 0 | 0 | 0 |
| 1 | 0 | 1 |
| 0 | 1 | 1 |
This means the original bits can be restored by applying XOR again.
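Here's a tiny sketch of that round trip in Go (the byte and the key are arbitrary values I made up):
package main

import "fmt"

func main() {
	data := byte(0b01010110) // original byte
	key := byte(0b10011001)  // arbitrary masking key

	masked := data ^ key     // XOR once to mask
	restored := masked ^ key // XOR again with the same key to restore

	fmt.Printf("%08b -> %08b -> %08b\n", data, masked, restored)
}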
What are these 0x, 0b prefixes?
Hexadecimal (base 16) and binary (base 2) prefixes. 0x indicates a hexadecimal number, while 0b indicates a binary number.
For example, 0xFF is 255 in decimal, and 0b11111111 is also 255 in decimal.
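Go supports both prefixes as literals, so you can verify this directly:
package main

import "fmt"

func main() {
	fmt.Println(0xFF)               // 255
	fmt.Println(0b11111111)         // 255
	fmt.Println(0xFF == 0b11111111) // true
}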
Bits, Bytes, and my barely functioning Brain
Since I never had formal computer science training, working with raw bits didn’t come naturally. WebSockets, unfortunately, require a lot of bitwise operations.
Some of the tricky concepts I ran into:
XOR Masking
WebSocket payloads sent from clients are XOR-masked with a key. The protocol requires this:
- to avoid intermediaries misinterpreting the traffic as HTTP
- for obfuscation, mitigating cross-protocol attacks
Bitwise AND (&) and OR (|)
Used for frame parsing.
Just like XOR, these combine bits, but there are some creative uses. Take, for example: byte2 & 0x80 != 0.
This combines our byte with the hex 0x80 (which is 10000000 in binary) to check if the most significant bit is set.
Note the number we're combining with; that is how we check which bit we're interested in.
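A short sketch (using a made-up byte) of the two sides of this: & to test whether a bit is set, and | to force one on:
package main

import "fmt"

func main() {
	b := byte(0b10000010) // made-up example byte

	// AND with a mask keeps only the bit we're interested in;
	// if the result is non-zero, that bit was set
	fmt.Println(b&0x80 != 0) // true: the most significant bit is set
	fmt.Println(b&0x01 != 0) // false: the least significant bit is not

	// OR with a mask forces a bit to 1 without touching the others
	fmt.Printf("%08b\n", b|0x01) // 10000011
}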
Hexadecimal Conversions – Constantly switching between hex, decimal, and binary.
I had to look up conversions more times than I care to admit. But I did learn some things in the process, such as the fact that going from 16-bit to 8-bit means you lose the most significant bits. While this seems self-evident now, I would not have assumed this when I started.
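For example, converting a 16-bit value to 8 bits in Go simply throws away the high byte:
package main

import "fmt"

func main() {
	wide := uint16(0x1234)
	narrow := uint8(wide)                    // only the least significant byte survives
	fmt.Printf("%#x -> %#x\n", wide, narrow) // 0x1234 -> 0x34
}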
Learning Go
I had little difficulty picking up the syntax given my experience with other languages; however, some parts still haven't fully settled in my mind (see the sketch after this list):
- arrays are passed by value, while slices are passed by reference
- structs are passed by value, but if you pass around a pointer to a struct, that gives you reference semantics
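Here's a minimal sketch (with hypothetical types, not code from the project) of how that plays out:
package main

import "fmt"

type point struct{ x, y int }

func bumpArray(a [2]int)   { a[0] = 99 } // gets a copy of the whole array
func bumpSlice(s []int)    { s[0] = 99 } // gets a copy of the slice header, but shares the backing array
func bumpStruct(p point)   { p.x = 99 }  // gets a copy of the struct
func bumpPointer(p *point) { p.x = 99 }  // gets a pointer, so it mutates the original

func main() {
	arr := [2]int{1, 2}
	sl := []int{1, 2}
	pt := point{1, 2}

	bumpArray(arr)
	bumpSlice(sl)
	bumpStruct(pt)
	bumpPointer(&pt)

	fmt.Println(arr, sl, pt) // [1 2] [99 2] {99 2}
}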
I also found it a little odd that interfaces are implemented implicitly, so I can't tell right away, just by looking at a type, which interfaces it satisfies. But thankfully, with modern IDEs, this is a non-issue. Another thing I must admit is that I definitely need to do more with channels, because I feel there are foot guns there I have yet to discover.
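For the interface point, a small sketch of what "implicitly" means in practice (the types here are made up):
package main

import "fmt"

type greeter interface {
	Greet() string
}

// server never mentions greeter, yet it satisfies the interface
// simply by having a Greet method with the right signature
type server struct{}

func (server) Greet() string { return "hello from the server" }

func main() {
	var g greeter = server{}
	fmt.Println(g.Greet())
}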
Here's a further tidbit I came across: Go doesn't have enums. Coming from languages that have enums, or at least something resembling them, I was surprised. No matter! It's not something I reach for often, and the following stand-in worked just fine.
type Opcode = int

const (
	Continuation = 0x0
	Text         = 0x1
	Binary       = 0x2
	Close        = 0x8
	Ping         = 0x9
	Pong         = 0xA
)
Building the WebSocket Server in Go
Given that this was a learning exercise, I only wanted to use the standard library. And I must admit, I am warming up to it. I was surprised by how much functionality is built in and how powerful it is. For example, the net/http package handles each request in a separate goroutine, which is an instant win for performance.
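As a quick illustration, a throwaway sketch (not the project's actual handler) of how little code that takes:
package main

import (
	"fmt"
	"log"
	"net/http"
)

func main() {
	// net/http serves each incoming request in its own goroutine
	http.HandleFunc("/", func(w http.ResponseWriter, r *http.Request) {
		fmt.Fprintln(w, "hello")
	})
	log.Fatal(http.ListenAndServe(":8080", nil))
}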
I will not go into the nitty-gritty of the implementation, as you can see it for yourself, but I will highlight some of the more interesting parts. My project doesn't implement the full WebSocket protocol as defined in RFC 6455, but rather an MVP.
The Handshake
The first thing that happens when you want to establish a WebSocket connection is a handshake. This is a run-of-the-mill HTTP request with some headers indicating that you want to upgrade to a WebSocket connection. There is a complication here in the form of the Sec-WebSocket-Key header. It serves multiple purposes: preventing caching, tracking connections, and ensuring the server actually understands the assignment. It is also meant to prevent cross-protocol attacks(?), but I'm not going down that rabbit hole. The key is generated by the client, and the server must hash it along with a pre-specified "magic string" before sending it back. If all goes well, the server sends back a rather succinct response:
HTTP/1.1 101 Switching Protocols
Upgrade: websocket
Connection: Upgrade
Sec-WebSocket-Accept: s3pPLMBiTxaQ9kYGzzhZRbK+xOo=
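The hashing step is small enough to sketch here; the magic string is the fixed GUID from RFC 6455, and the key below is the RFC's own example value:
package main

import (
	"crypto/sha1"
	"encoding/base64"
	"fmt"
)

func main() {
	key := "dGhlIHNhbXBsZSBub25jZQ=="               // Sec-WebSocket-Key sent by the client
	magic := "258EAFA5-E914-47DA-95CA-C5AB0DC85B11" // fixed "magic string" from RFC 6455

	// SHA-1 the concatenation, then base64-encode the digest
	sum := sha1.Sum([]byte(key + magic))
	fmt.Println(base64.StdEncoding.EncodeToString(sum[:])) // s3pPLMBiTxaQ9kYGzzhZRbK+xOo=
}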
This is where the crux of the logic comes in. So far we had a nice *http.Request to work with, but now we want to read straight from the connection, so we have to 🏴☠️ hijack the http.ResponseWriter to get to the connection it was abstracting away:
conn, _, err := w.(http.Hijacker).Hijack()
Reading the Connection
Next, we'll break out into a goroutine to handle the connection. (💥 blazingly fast!) In short, we'll continuously read from the connection and handle the result. However, the devil is in the details (the bits, really), and reading the connection is a more involved process. The data we're getting is called a "frame", and it has a specific structure.
First, we read from the connection:
// read only the first two bytes
header := make([]byte, 2)
_, err = io.ReadFull(conn, header)
Reading also advances our position in the stream, so when we next read from the connection, we won't read the same bytes (handy!). To extract all the fields packed into these two bytes, we have to do a bit of bitwise gymnastics.
The first bit of the first byte determines if this is the final fragment (used with fragmented frames; we're not worried about those in this project). Then we have three (RSV) bits, which are reserved for extensions/future use. The next four bits mark the opcode (what kind of frame this is). Simple! We're already done with a byte!
Moving on, the first bit of the second byte is special in a similar fashion to the first. If 1, then the payload is masked (more on that later). The remaining seven bits are the payload length.
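Putting the two header bytes together, the field extraction looks roughly like this (the variable names are mine; a made-up header stands in for real data):
package main

import "fmt"

func main() {
	// a made-up header: FIN set, text opcode (0x1), mask bit set, length 5
	header := []byte{0x81, 0x85}

	fin := header[0]&0x80 != 0          // first bit: is this the final fragment?
	opcode := header[0] & 0x0F          // last four bits: the frame type
	masked := header[1]&0x80 != 0       // first bit of the second byte: is the payload masked?
	payloadLen := int(header[1] & 0x7F) // remaining seven bits: the (initial) payload length

	fmt.Println(fin, opcode, masked, payloadLen) // true 1 true 5
}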
Payload Length
This is a bit of a "nightmare" to read because the payload length value has different meanings.
- If the value is less than 126, it is the true size of the payload in bytes.
- If the value is 126, the next two bytes contain the size of the payload.
- If the value is 127, the next eight bytes are the size of the payload.
Remember big-endian from the Learning with AI: No Dumb Questions section? In either of these special cases, to get the true payload size, we have to combine the next two or eight bytes, byte by byte, into a single number.
// loop over the byte slice, shifting each byte into place according to its
// position from the most significant end, and OR it into the length
for i := 0; i < 8; i++ {
	length |= int(payloadLength[i]) << (56 - i*8)
}
To clarify: first we take the most significant byte's value and shift it left (<<) by 56 bits (7 bytes); this puts seven sets of 0000 0000 after it, creating a 64-bit number. The |= operator is a bitwise OR assignment that combines the shifted value into our running total. The rest of the iterations work the same way, except we shift left by eight bits less each time.
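The two-byte case (length marker 126) works the same way, just with a smaller shift; here's a sketch of doing it by hand and of the shortcut encoding/binary offers (not necessarily what the project does):
package main

import (
	"encoding/binary"
	"fmt"
)

func main() {
	// pretend these are the two extended-length bytes read off the wire
	ext := []byte{0x01, 0x90}

	// by hand: shift the most significant byte up by 8 bits, then OR in the other
	length := int(ext[0])<<8 | int(ext[1])
	fmt.Println(length) // 400

	// or let encoding/binary do the big-endian bookkeeping
	fmt.Println(binary.BigEndian.Uint16(ext)) // 400
}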
The Reveal
I promise we're close to reading our payload of "Hello, World!".
If our payload is masked, which it will be from a browser, we have to unmask it.
what if it's from a server? 🚨 rabbit hole alert 🚨
The masking key is the next four bytes after the payload size. With this we can unmask the payload by XOR-ing it with the masking key.
// loop over the bytes
for i := range payload {
// XOR the payload:
// at the index we combine the payload with the mask key at the corresponding index
// modulo ensures that the mask key is repeated
payload[i] ^= maskKey[i%4]
}
To recap, we apply the XOR operation to each byte with a rotating mask key (first byte with first mask, second with second, and so on). The modulo operator ensures that we cycle through the mask key as we read the payload.
Congratulations! You've successfully read a WebSocket frame. We now know it is a final frame, the type of frame, and we have the content. We can now handle it to our heart's content.
Sending a Message
This I won't cover, as it is the reverse of reading a message, except for masking: the server is not expected to mask the payload. If you're still interested, you can check out the code at the link in the Final Notes section.
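For a flavor of it anyway, here's a minimal sketch of writing an unmasked text frame, assuming a payload shorter than 126 bytes (this is not the project's exact code):
package main

import (
	"io"
	"log"
	"os"
)

// writeTextFrame writes a single unmasked text frame; it assumes the
// payload is shorter than 126 bytes so the length fits in the second byte
func writeTextFrame(w io.Writer, payload []byte) error {
	frame := make([]byte, 0, 2+len(payload))
	frame = append(frame, 0x81)               // FIN bit set + text opcode (0x1)
	frame = append(frame, byte(len(payload))) // mask bit clear, 7-bit payload length
	frame = append(frame, payload...)
	_, err := w.Write(frame)
	return err
}

func main() {
	// writing to stdout just to show the bytes; in the server this would be the connection
	if err := writeTextFrame(os.Stdout, []byte("Hello, World!")); err != nil {
		log.Fatal(err)
	}
}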
The Rabbit Holes
Here are a couple of topics that I followed up on until I had some understanding of them.
Malicious Fragmented Frames: The first couple of bytes of a WebSocket frame also specify the payload size. The maximum payload size is 2^63 bytes, around ~9.2 exabytes (exabyte > petabyte > terabyte > gigabyte). What if a malicious client were capable of sending that much or more? Surely some infrastructure would fail or flag you, or perhaps the server would fail to allocate enough memory. It turns out people much cleverer than I am have worked on such things, and the OS handles this quite gracefully. When the client sends the data, the OS reads it from the TCP stream into a buffer (a space managed in the kernel, not directly accessible to applications). This buffer is managed by the OS using ACK (acknowledge) TCP packets, which carry a header field called window size, telling the client how much data it can send: presumably however much space is left in the buffer. If the data sent isn't the specified size, the OS may just drop the packets or close the connection outright. I'm certain there's more to this than my surface-level knowledge (there always is), but you can read more about the Transport Layer of the OSI model if you're interested.
Pulling on Go's Goroutine thread: One of the big draws of Go is its concurrency, achieved with developer-friendly goroutines. To understand goroutines, it's important to know how the OS handles the workload.
The OS scheduler manages software threads on top of the hardware threads of the CPU's physical cores. A thread can be in one of three states:
- runnable (ready to go, need some CPU time assigned),
- waiting (waiting on network, IO-bound tasks, etc.),
- or executing (the sweet spot).
When the OS scheduler assigns a thread to a core, it will run until it is either blocked or the time assigned to it is up. This is a rather complex algorithm called preemptive scheduling, meaning we can't know when a thread is going to be interrupted (many factors here are non-deterministic, like events and network activity, all compounded by thread priorities).
In some ways, Go's scheduler mirrors the OS scheduler. It looks like a preemptive scheduler and acts like a preemptive scheduler, but it isn't (fully) one. It's a cooperative scheduler; however, it will also preempt CPU-bound goroutines after 10ms (read: hogging the CPU). This means that the Go scheduler will switch context when a goroutine is blocked or when it yields (at safe points). But for all intents and purposes, from a developer's perspective, you can think of it as a preemptive scheduler. To keep track of waiting goroutines, Go has two kinds of run queues, both FIFO: a global run queue (GRQ) and a local run queue (LRQ). The global queue is shared by all threads, while each local queue belongs to a single logical processor. Go uses three parts to manage the scheduling of tasks:
- G (Goroutine): Your actual task with its own stack and state (states just like the OS thread)
- P (Logical Processor): Acts as a virtual CPU with its own local run queue
- M (Machine): The actual OS thread that executes code
When you call go myFunc(), Go creates a G, puts it in a run queue (either global or local), and then one of the Ms will eventually pick it up to execute it. What's fascinating is that if a G blocks (like waiting on I/O or a channel), the M can detach from it, grab another P, and continue executing other Gs from the local queue. When the blocked G is ready again, it gets placed back in a run queue.
This elegant dance between Gs, Ms, and Ps is why Go can handle thousands of goroutines with just a few OS threads.
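A trivial sketch of that claim: the program below parks ten thousand goroutines at once, far more than any machine has OS threads, and the scheduler shrugs it off.
package main

import (
	"fmt"
	"sync"
	"time"
)

func main() {
	var wg sync.WaitGroup

	// 10,000 goroutines, each spending its life "waiting"; the runtime
	// multiplexes them onto a handful of OS threads
	for i := 0; i < 10000; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			time.Sleep(100 * time.Millisecond)
		}()
	}

	wg.Wait()
	fmt.Println("all done")
}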
A couple of other interesting things I learned while falling down this hole:
- the hex numbers at the end of the stack trace are the instruction pointer's offset to the next instruction
- I'm definitely out of my depth here
- OS threads using the same cache can lead to false sharing, which is when two threads work on the same cache line (a small chunk of memory the CPU moves around as a unit), causing them to invalidate each other's caches and leading to performance issues.
Final Notes
All in all, I found this to be an invaluable exercise, and I think I have a base understanding of WebSockets now. As for Go, I think I will be using it more in the future and will continue to challenge myself with it. This also gives me a newfound appreciation for how much work goes into open source projects.
Here are some of the things you should take away from this:
- If you're in a similar position to where I was, I suggest you check out the project on GitHub. It's well commented and has more substance than this post.
- Most of the time you're better off using well-established libraries instead of hand-rolling your own. Don't be that person.
- Without AI, I would have spent hours wading through dense RFC documents. Instead, I got fast, to-the-point explanations and moved on. Given this isn't a novel subject, I see no issue using AI to help you learn.