Coding Specification for Synthesizable Field Programmable Gate Array (FPGA) Designs
FPGA Synthesizable Coding Guidelines: Rules That Actually Save You From Synthesis Failures
Writing RTL that simulates cleanly is easy. Writing RTL that synthesizes predictably, maps efficiently, and survives three generations of toolchain updates is a different skill entirely. Most synthesis failures do not come from complex algorithms. They come from small coding habits that look harmless in simulation but confuse the synthesis tool into generating garbage or rejecting the design outright.
This guide covers the coding rules that matter most when your target is an FPGA fabric, not a simulation waveform.
Clock Domain Crossing: The Number One Source of Synthesis Headaches
Most non-trivial FPGA designs have more than one clock domain. The moment you move a signal from one domain to another, you open the door to metastability, data corruption, and timing reports that either flag the crossing as failing or skip it entirely, depending on how the clocks are constrained. The fix is not complicated, but it requires discipline from the first line of code.
Use Synchronizers for Every Single-Bit Control Signal
A two-flop synchronizer is the minimum for any control signal crossing clock domains. Do not argue with this. Do not try to save one flip-flop. The first flop catches the metastable event, and the second flop gives it a full clock cycle to resolve before your logic sees it.
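A minimal sketch of the two-flop synchronizer described above. The module and port names (sync_2ff, clk_dst, async_in, sync_out) are illustrative assumptions, and the ASYNC_REG attribute is specific to Xilinx Vivado; other toolchains use their own attributes or constraints for the same purpose.

```verilog
// Two-flop synchronizer for one single-bit control signal.
// Names are hypothetical; this is not a vendor library cell.
module sync_2ff (
    input  wire clk_dst,   // destination-domain clock
    input  wire async_in,  // signal launched from another clock domain
    output wire sync_out   // safe to use in the clk_dst domain
);
    // ASYNC_REG (Vivado) asks the tool to place the two flops close
    // together and to treat them as a synchronizer chain.
    (* ASYNC_REG = "TRUE" *) reg meta_ff;
    (* ASYNC_REG = "TRUE" *) reg sync_ff;

    always @(posedge clk_dst) begin
        meta_ff <= async_in;  // may go metastable on a badly timed edge
        sync_ff <= meta_ff;   // gets a full clk_dst cycle to resolve
    end

    assign sync_out = sync_ff;
endmodule
```

The sketch assumes async_in comes straight from a flip-flop in the source domain. Combinational logic in front of a synchronizer can glitch, and the synchronizer will happily capture the glitch.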
For signals that carry data, not just a toggle, use an asynchronous FIFO. The FIFO stores the data in dual-port memory and passes only its Gray-coded read and write pointers through synchronizers, so your logic sees clean, synchronized data on both sides. Hand-written handshake CDC logic can work, but only if the request and acknowledge signals each go through their own synchronizer and the data bus is held stable until the acknowledge returns. In practice, homegrown handshakes fail because one of those two conditions gets skipped.
Never Gate Clocks Inside Your RTL
Clock gating is a power optimization technique, and it belongs in the back-end tools or the device's dedicated clock buffers, not in your RTL. When you write assign clk_gated = clk & enable; and feed that into a flip-flop clock pin, the synthesis tool builds the AND gate out of a LUT. The gated clock then leaves the dedicated clock network and travels over general-purpose routing, which introduces skew, can glitch if enable changes while clk is high, and can break timing closure on high-frequency designs.
If you need clock enables, use them on the data path, not on the clock itself. Write always @(posedge clk) if (enable) q <= d; instead. The synthesis tool will implement this as a clock enable on the flip-flop, which is exactly what the hardware supports natively. Same functional result, zero clock network risk.
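As a sketch, the clock-enable form can be wrapped in a small parameterized register. The module name ce_reg and the WIDTH parameter are assumptions made for illustration.

```verilog
// Register with a clock enable on the data path.
// The clock pin is driven by the untouched clk; only the enable
// decides whether q is updated on a given edge.
module ce_reg #(
    parameter WIDTH = 8
) (
    input  wire             clk,
    input  wire             enable,
    input  wire [WIDTH-1:0] d,
    output reg  [WIDTH-1:0] q
);
    always @(posedge clk) begin
        if (enable)
            q <= d;   // when enable is low, q simply holds its value
    end
endmodule
```

There is deliberately no else branch. In a clocked block, a missing else means "hold", which is the behavior a flip-flop clock-enable pin provides.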
Reset Strategy: Keep It Simple or Pay for It Later
Resets are where FPGA designs either behave predictably or turn into debugging nightmares. The goal is straightforward: every flip-flop should have a known state after reset, and the reset deassertion should not create glitches.
Asynchronous Assertion, Synchronous Release
The most reliable pattern is an asynchronous reset with synchronous deassertion. The reset signal clears the flip-flop immediately, regardless of clock. But the release happens on the clock edge, which guarantees that every flip-flop sees the deassertion with enough recovery time before the next edge, so all of them leave reset in the same cycle and none of them goes metastable.
Writing always @(posedge clk or posedge rst) with if (rst) q <= 0; else q <= d; gives you the asynchronous assertion. The synthesis tool maps it to the flip-flop’s built-in async clear pin, which is exactly what you want. The synchronous release does not come from this template; it comes from driving rst out of a reset synchronizer clocked by the same clk. Do not use if (!rst) with an active-low reset unless your entire design uses active-low resets consistently. Mixing polarities in the same module is a recipe for subtle bugs.
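One common way to get asynchronous assertion with synchronous release is a small reset synchronizer placed in front of the flip-flops it resets. The sketch below assumes an active-high reset; reset_sync, rst_async, and rst are illustrative names.

```verilog
// Reset synchronizer: rst asserts as soon as rst_async goes high,
// and deasserts two clk edges after rst_async goes low.
module reset_sync (
    input  wire clk,
    input  wire rst_async,  // raw asynchronous reset request
    output wire rst         // feed this to "posedge rst" in the design
);
    reg [1:0] sync_ff;

    always @(posedge clk or posedge rst_async) begin
        if (rst_async)
            sync_ff <= 2'b11;               // assert immediately
        else
            sync_ff <= {sync_ff[0], 1'b0};  // shift in 0 after release
    end

    assign rst = sync_ff[1];
endmodule
```

After rst_async drops, sync_ff goes 11, then 10, then 00, so rst stays high for one more full cycle and falls right after a clock edge. That gives every downstream flip-flop close to a full period of recovery time.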
Avoid Reset Chains That Span Multiple Modules
When reset fans out from a single source to dozens of modules, the arrival time varies. Some flip-flops come out of reset one clock cycle earlier than others. If your design depends on all modules being out of reset simultaneously, this skew will break it.
The practical fix: keep reset local to each module, and use a reset synchronization circuit at the module boundary. Each module has its own reset register that synchronizes the global reset into its local clock domain. This adds a cycle or two of latency, but the release is now aligned to the local clock instead of to the arrival time of a long reset net. For most designs, that extra latency does not matter. What matters is that the design works every time you power it on.
Combinational Logic: What the Synthesis Tool Actually Sees
Synthesis tools do not understand your intent. They see equations. If you write combinational logic that has unintended latches or incomplete sensitivity lists, the tool will do exactly what you told it to do, not what you meant.
Always Use Fully Specified Case Statements for Multiplexers
When you write a case statement for a decoder or mux in a combinational always block, cover every possible input combination and assign every output in every branch. If you leave out a case, the synthesis tool infers a latch to hold the previous value. Latches in FPGA designs are almost always unintentional, and they cause timing analysis failures because the tool cannot predict when the latch will be transparent.
Use default branches explicitly. Write default: out = 0; even when you think every case is covered. The extra line costs nothing and prevents the tool from guessing.
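A short sketch of a fully specified mux follows. It has three data inputs and a 2-bit select, so one select value is genuinely unused and the default branch does real work. The names mux3, sel, a, b, c, and out are assumptions for illustration.

```verilog
// Three-input mux with an explicit default branch.
// sel == 2'b11 has no data input; without the default, out would
// have to hold its old value and the tool would infer a latch.
module mux3 #(
    parameter WIDTH = 8
) (
    input  wire [1:0]       sel,
    input  wire [WIDTH-1:0] a, b, c,
    output reg  [WIDTH-1:0] out
);
    always @(*) begin
        case (sel)
            2'b00:   out = a;
            2'b01:   out = b;
            2'b10:   out = c;
            default: out = {WIDTH{1'b0}};  // covers 2'b11 and X/Z in simulation
        endcase
    end
endmodule
```

An equivalent style is to assign out a default value at the top of the block and override it inside the case. Either way, every path through the block assigns out.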
Separate Combinational and Sequential Always Blocks
This is the single most effective rule for clean synthesizable code. Never mix combinational and sequential logic in the same always block. If you need a registered output with combinational input logic, use two always blocks: one combinational block that computes the next value, and one sequential block that registers it on the clock edge.
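```verilog
// Combinational block: compute the next value.
always @(*) begin
    next_state = current_state & enable;
end

// Sequential block: register it on the clock edge (synchronous reset).
always @(posedge clk) begin
    if (rst) current_state <= 0;
    else     current_state <= next_state;
end
```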
This pattern is unambiguous. The synthesis tool knows exactly what is combinational and what is sequential. Mixing them in one block puts blocking and nonblocking assignments side by side and buries priority logic inside the clocked code, which is where simulation and hardware start to disagree, and debugging that mismatch wastes days.
Arithmetic and Bit Manipulation: Avoiding Silent Overflows
FPGA arithmetic is fixed-width by nature. When you add two 8-bit numbers into an 8-bit destination and the result needs 9 bits, the 9th bit disappears. No warning. No error. Just silence.
Size Your Operands Before the Operation
If you are adding two N-bit numbers and the result might overflow, extend both operands to N+1 bits before the addition. Do not rely on the synthesis tool to catch overflow. It will not. It will truncate the result to match the left-hand side assignment width, and your design will produce wrong answers without any indication that something went wrong.
For multiplication, the width grows even faster. Two 16-bit numbers multiplied produce a 32-bit result. If you assign that to a 16-bit register, you lose the upper 16 bits. Always check the bit width of your destination register against the bit width of the operation.
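The width rules above can be sketched as follows, assuming unsigned operands. The module and signal names (widen_arith, a8, b8, a16, b16, sum9, prod32) are illustrative.

```verilog
// Destination widths sized to hold the full result.
module widen_arith (
    input  wire [7:0]  a8,  b8,
    input  wire [15:0] a16, b16,
    output wire [8:0]  sum9,    // 8 bits + 8 bits needs 9 bits
    output wire [31:0] prod32   // 16 bits x 16 bits needs 32 bits
);
    // Zero-extend both operands to 9 bits before adding, so the
    // carry out of bit 7 lands in bit 8 instead of being dropped.
    assign sum9 = {1'b0, a8} + {1'b0, b8};

    // The 32-bit destination makes the multiply a 32-bit operation,
    // so the upper half of the product is kept.
    assign prod32 = a16 * b16;
endmodule
```

For example, 200 + 100 is 300, which fits in the 9-bit sum9. Assigned to an 8-bit wire instead, the same addition would wrap to 44.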
Use Explicit Bit Slicing Instead of Shift Tricks
When you need specific bits from a bus, use named ranges or explicit indices. Avoid relying on implicit truncation. Write result = data[15:8]; rather than result = data >> 8; when the upper bits do not matter. The shift version depends on the destination width to drop the unwanted bits, and the arithmetic shift >>> fills with the sign bit or with zeros depending on whether the operand is signed or unsigned.
For signed arithmetic, declare your signals as signed explicitly. The default in Verilog is unsigned, and a single unsigned operand makes the whole expression unsigned, so signed values get zero-extended instead of sign-extended. The results are technically correct but almost never what you intended.
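A small sketch of mixing a signed and an unsigned operand safely in Verilog. The names signed_add, a, offset_u, and y are assumptions for illustration.

```verilog
// Adding an unsigned offset to a signed value without losing the sign.
module signed_add (
    input  wire signed [7:0] a,         // range -128 .. 127
    input  wire        [7:0] offset_u,  // range 0 .. 255
    output wire signed [9:0] y          // range -512 .. 511, wide enough
);
    // Zero-extend the unsigned value by one bit so it stays positive,
    // then cast it to signed. Every operand is now signed, so the
    // expression is signed and a is sign-extended to 10 bits.
    assign y = a + $signed({1'b0, offset_u});
endmodule
```

With a = -1 and offset_u = 1, this gives y = 0. Writing a + offset_u directly would make the expression unsigned, treat a as 255, and give 256.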
Coding for Synthesis: Patterns That Map Well to FPGA Fabric
Not all RTL patterns synthesize equally well. Some map to efficient LUT structures. Others force the tool to build wide multiplexers that eat routing resources and destroy timing.
Prefer One-Hot Encoding for State Machines in FPGAs
For state machines with up to about 10 states, one-hot encoding (one flip-flop per state) almost always outperforms binary encoding on FPGAs. The synthesis tool maps each state bit directly to a LUT input, which means the next-state logic becomes simple AND-OR trees with no wide muxes.
Binary encoding saves flip-flops but creates wide multiplexers for the next-state decode logic. On an FPGA, LUTs are abundant and flip-flops are cheap, so the trade-off favors one-hot. For state machines with 20 or more states, the flip-flop count starts to matter, and binary or gray encoding becomes worthwhile. Know where the crossover point is for your target device.
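As a sketch, a three-state one-hot machine can be written with one bit per state. The state names, ports, and the onehot_fsm module name are hypothetical. Many synthesis tools also re-encode state machines on their own, so check the synthesis report to see which encoding was actually used.

```verilog
// Three-state FSM with hand-written one-hot encoding.
module onehot_fsm (
    input  wire clk,
    input  wire rst,    // synchronous, active-high
    input  wire start,
    input  wire done,
    output wire busy
);
    localparam [2:0] S_IDLE = 3'b001,
                     S_RUN  = 3'b010,
                     S_DONE = 3'b100;

    reg [2:0] state, next_state;

    // Combinational next-state logic.
    always @(*) begin
        next_state = state;                  // default: stay put
        case (state)
            S_IDLE:  if (start) next_state = S_RUN;
            S_RUN:   if (done)  next_state = S_DONE;
            S_DONE:  next_state = S_IDLE;
            default: next_state = S_IDLE;    // recover from illegal codes
        endcase
    end

    // State register.
    always @(posedge clk) begin
        if (rst) state <= S_IDLE;
        else     state <= next_state;
    end

    // With one-hot encoding, an output can be a single state bit.
    assign busy = state[1];
endmodule
```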
Avoid Deep Combinational Chains
Every LUT in an FPGA has a limited number of inputs, typically 4 to 6. Logic wider than that is built from several LUT levels stitched together through the routing fabric, and once a path runs through more than a handful of levels, timing closure becomes painful.
Break long combinational paths with pipeline registers. If a calculation takes 10 LUT levels, insert a register after level 5. The latency increases by one clock cycle, but the maximum clock frequency can come close to doubling. This is the fundamental trade-off in FPGA design: pipeline for speed, or accept a lower clock and save registers. Almost always, pipeline wins.
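A sketch of the idea using a multiply followed by an add, split into two stages. The module name pipelined_mac and its ports are assumptions; the operands are treated as unsigned.

```verilog
// y = a * b + c, computed over two clock cycles instead of one.
module pipelined_mac #(
    parameter WIDTH = 16
) (
    input  wire               clk,
    input  wire [WIDTH-1:0]   a, b, c,
    output reg  [2*WIDTH:0]   y        // one extra bit for the add
);
    reg [2*WIDTH-1:0] prod_q;  // stage 1 result
    reg [WIDTH-1:0]   c_q;     // c delayed to stay aligned with prod_q

    always @(posedge clk) begin
        // Stage 1: multiply only.
        prod_q <= a * b;
        c_q    <= c;
        // Stage 2: add, using the values registered one cycle earlier.
        y      <= prod_q + c_q;
    end
endmodule
```

Note that c has to be delayed along with the product. Forgetting to delay the side inputs is the most common bug when a pipeline stage is added to an existing path.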
Naming and Structure: Making Code Readable for the Tool and the Human
Synthesis tools do not care about your signal names. But your team does, and more importantly, the person debugging your design at 3 AM cares a lot.
Use Consistent Prefixes for Clock and Reset Signals
Mark every clock signal with clk_ and every reset with rst_. This is not just a style choice. When you scan through a module, you can instantly identify the timing-critical signals. It also makes it harder to wire a data signal to a clock pin by accident, a mistake that can simulate cleanly yet produce a poorly routed, unconstrained clock in hardware, and the discrepancy between the two will drive you crazy.
Keep Module Sizes Under 500 Lines
Large modules are hard to verify, hard to debug, and hard for the synthesis tool to optimize. When a module grows beyond 500 lines, it is a sign that you should split it into smaller submodules. Each submodule should have a single responsibility: one state machine, one data path, one interface handler.
Smaller modules can also mean faster compile times. Flows that support hierarchical, incremental, or out-of-context compilation can process modules independently, and in those flows a 2000-line monolith takes significantly longer to optimize than four 500-line modules that the tool can handle in parallel or skip when they have not changed.
What to Avoid: Coding Patterns That Cause Silent Failures
Some patterns look correct but create problems that only show up after the design is on hardware.
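Do not rely on initial blocks for synthesis. They always work in simulation, but synthesis support varies: many SRAM-based FPGA flows treat them as power-up values only, and other targets ignore them. A power-up value is applied once at configuration and never again, so a register that depends on an initial block has no defined way back to its starting state after a runtime reset, and on a target that ignores the block it starts out undefined.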
Do not use # delays in synthesizable code. Delay statements are simulation-only constructs. If you leave one in by accident, some tools will accept it and others will reject it. Either way, it does nothing in hardware.
Do not expect to read the new value of a register, or of a memory location, in the same clock cycle you write it. A plain nonblocking read returns the old value. If you code around that to forward the new value, the synthesis tool has to build a bypass mux, and the mux adds delay that can break your timing. If you need the new value, restructure the logic so the read happens one cycle later. It is cleaner and it works.
Do not connect multiple drivers to the same net unless it is a tri-state bus on I/O pins with a proper enable protocol. Multiple drivers on a single net create contention. Many synthesis tools report a multi-driven net as an error or critical warning, but some configurations quietly resolve it instead, and the result is hardware that does not match your simulation. Read the synthesis log for multi-driver messages and treat every one as a bug.