## The Effect of LUT and Cluster Size on Deep-Submicron FPGA Performance and Density

Elias Ahmed, Jonathan Rose

Year of publication: 2000 Area: Architecture



This seminal study considered both the performance and area impacts of organizing logic into hierarchies in FPGA architecture, namely LUT size (K) and cluster size (N). At the time this paper was written, academic FPGAs were just starting to consider clustered organization. Commercially, both Altera and Xilinx FPGAs used hierarchy but with different cluster sizes and interfaces. This paper was the first work to fully consider both the area and delay tradeoffs in context with the number of inputs to the cluster, and to do so in a clear scientific manner.

The key results of this paper were to relate the ideal number of input muxes (I) given the overall cluster-size (as a function of N,K) and to illustrate the optimal ranges of N and K for specific design goals combining area and delay. Remarkably, the conclusion that LUT-sizes of 5-6 were even better for area-delay than previously seen foreshadowed the change to 6-input LUT structures that we see in modern FPGAs.

The paper is notable not just for its conclusions, but for the clarity and persistence of the end-toend exploration methodology. Ahmed and Rose point out that the optimal values for these architectural results are dependent on current process parameters, that process parameters had changed the optimal point since previous studies, and outline a clear and reproducible framework for updating the work as the underlying technology continues to evolve.

Beyond the obvious influences of this paper, by now well-known to every FPGA architect, this work also set a standard for a complete (synthesis to routing) architectural evaluation as subsequent architectural studies adopted the ideal ranges set forth here for their follow-on studies in both architecture and CAD algorithms.

Mike Hutton