# General Reading List

The papers below are a starting point for people new to the area.

#### 1963
**[Parallel Processing in a Restructurable Computer System](https://dx.doi.org/10.1109/PGEC.1963.263558)**  
Gerald Estrin, B. Bussell, R. Turn, and J. Bibb  
IEEE Transactions on Electronic Computers, Volume 12, Issue 6, pp. 747--754, December, 1963  
*Early predecessor to reconfigurable computers. This work predates integrated circuits, so changing a "configuration" required physically moving wires, but the goal was the same. Modern FPGAs make this vision practical.*

#### 1982
**[The Yorktown Simulation Engine](https://dl.acm.org/doi/10.5555/800263.809186)**  
Monty Denneau  
Proceedings of the 19th Design Automation Conference, pp. 55--59, 1982  
*This was a pre-FPGA logic simulation engine that was also used to simulate logic before building hardware. It includes most of the ideas behind multicontext FPGAs.*

#### 1986
**A User Programmable Reconfigurable Logic Array**  
William S. Carter, Khue Duong, Ross H. Freeman, Hung-Cheng Hsieh, Jason Y. Ja, John E. Mahoney, Luan T. Ngo, and Shelly L. Sze  
Proceedings of the IEEE Custom Integrated Circuits Conference, pp. 233--235, May 1986  
*First peer-reviewed, public description of a commercial FPGA.*

#### 1990
**[Architecture of Field-Programmable Gate Arrays: The Effect of Logic Block Functionality on Area Efficiency](https://dx.doi.org/10.1109/4.62145)**  
Jonathan Rose, Robert Francis, David Lewis, and Paul Chow  
IEEE Journal of Solid-State Circuits, Volume 25, Number 5, pp. 1217--1225, October, 1990  
*Why did we start with 4-LUT FPGAs? But more than that, this is a beautiful example of formulating a clean question about architecture, defining a parameterized space, identifying a cost model, and exploring the space to find the best option.*

#### 1991
**[Building and Using a Highly Programmable Logic Array](https://dx.doi.org/10.1109/2.67197)**  
Maya Gokhale, William Holmes, Andrew Kopser, Sara Lucas, Ronald Minnich, Douglas Sweely, and Daniel Lopresti  
IEEE Computer, Volume 24, Number 1, pp. 81--89, 1991  
*One of the early FPGA computing systems. Using a board of FPGAs, it demonstrated performance exceeding supercomputers on a specialized problem (DNA sequence matching). The entire capacity of one of these boards is smaller than today's midrange FPGAs.*

**[Compiling Occam into FPGAs](https://www.doc.ic.ac.uk/~wl/papers/fpl91a.pdf)**  
Ian Page and Wayne Luk  
FPGAs, pp. 271--283, Abingdon EE&CS Books, 1991  
*Describes a methodology for compiling a high-level language to FPGAs; a precursor to Handel-C.*

#### 1992
**[A Reconfigurable Multiprocessor IC for Rapid Prototyping of Algorithmic-Specific High-Speed DSP Data Paths](https://dx.doi.org/10.1109/4.173120)**  
Dev C. Chen and Jan M. Rabaey  
IEEE Journal of Solid-State Circuits, Volume 27, Number 12, pp. 1895--1904, December, 1992  
*Early coarse-grained reconfigurable device initially intended for rapid prototyping of DSP algorithms; PADDI has 16b functional units and 8 configuration contexts which operate in VLIW fashion.*

#### 1993
**[Virtual Wires: Overcoming Pin Limitations in FPGA-based Logic Emulators](https://web.archive.org/web/20230531224232/http://dx.doi.org/10.1109/FPGA.1993.279469)**  
Jonathan Babb, Russell Tessier, and Anant Agarwal  
Proceedings of the IEEE Workshop on FPGAs for Custom Computing Machines, pp. 142--151, April, 1993  
*Time-multiplexing the FPGA I/O to better balance I/O bandwidth with internal FPGA capacity in multi-FPGA systems.*

#### 1994
**[FlowMap: An Optimal Technology Mapping Algorithm for Delay Optimization in Lookup-Table Based FPGA Designs](https://dx.doi.org/10.1109/43.273754)**  
Jason Cong and Yuzheng Ding  
IEEE Transactions on Computer-Aided Design, Volume 13, Issue 1, pp. 1--12, January, 1994  
*How to cover logic with LUTs; nice observation that the problem can be reframed from logic packing to IO cuts. The use of dynamic programming and max flow is algorithmically elegant. There is a wealth of improvements and more sophisticated versions since this, but it's worth starting here for the cleanness of this basic problem formulation.*

#### 1995
**[PathFinder: A Negotiation-Based Performance-Driven Router for FPGAs](https://doi.acm.org/10.1145/201310.201328)**  
Larry McMurchie and Carl Ebeling  
Proceedings of the International Symposium on Field-Programmable Gate Arrays, pp. 111--117, 1995  
*The basic routing algorithm around which virtually all FPGA routing is built today. Dispenses with separate global/detail phases and uses adaptive costs to sort out congestion.*

**[Teramac---Configurable Custom Computing](https://dx.doi.org/10.1109/FPGA.1995.477406)**  
Rick Amerson, Richard Carter, W. Bruce Culbertson, Phil Kuekes, and Greg Snider  
Proceedings of the IEEE Workshop on FPGAs for Custom Computing Machines, pp. 32--38, April, 1995  
*A reconfigurable system based on a custom FPGA-like design aimed at rapid application mapping.*

**[Video Communications using Rapidly Reconfigurable Hardware](https://dx.doi.org/10.1109/76.475899)**  
John Villasenor, Chris Jones, and Brian Schoner  
IEEE Transactions on Circuits and Systems for Video Technology, Volume 5, Number 6, pp. 565--567, December 1995  
*Early article articulating and demonstrating the idea of using rapid Run-Time Reconfiguration in order to run large tasks on smaller FPGA systems.*

#### 1996

**[FPGA and CPLD Architectures: A Tutorial](https://dx.doi.org/10.1109/54.500200)**  
Stephen Brown and Jonathan Rose  
IEEE Design and Test of Computers, Volume 13, Number 2, pp. 42--57, 1996  
*An approachable tutorial for a general audience on FPGA and CPLD architectures.*

**[DPGA Utilization and Application](https://doi.acm.org/10.1145/228370.228387)**  
André DeHon  
Proceedings of the International Symposium on Field-Programmable Gate Arrays, pp. 115--121, February, 1996  
*What would you do with a multicontext FPGA and what benefits does it offer?*

**[MATRIX: A Reconfigurable Computing Architecture with Configurable Instruction Distribution and Deployable Resources](https://dx.doi.org/10.1109/FPGA.1996.564808)**  
Ethan Mirsky and André DeHon  
Proceedings of the IEEE Symposium on FPGAs for Custom Computing Machines, pp. 157--166, April, 1996  
*Early coarse-grained reconfigurable architecture that allows flexible organization of units and instruction distribution. The basic element is a composable 8b functional unit with a 256-byte memory/register file that can also be used to hold dynamic instructions.*

**[RaPiD---Reconfigurable Pipelined Datapath](https://dx.doi.org/10.1007/3-540-61730-2_13)**  
Carl Ebeling, Darren Cronquist, and Paul Franklin  
Proceedings of the International Conference on Field-Programmable Logic and Applications (published as LNCS-1142), pp. 126--135, Springer, 1996  
*Early coarse-grained, domain-specific reconfigurable architecture. RaPiD has 16b functional units arranged in a 1D linear array.*

**[Programmable Active Memories: Reconfigurable Systems Come of Age](https://dx.doi.org/10.1109/92.486081)**  
Jean E. Vuillemin, Patrice Bertin, Didier Roncin, Mark Shand, Hervé Touati, and Philippe Boucard  
IEEE Transactions on Very Large Scale Integration (VLSI) Systems, Volume 4, Number 1, pp. 56--69, March, 1996  
*One of the earliest FPGA computing systems. PAM demonstrated impressive performance from a board of FPGAs on a range of applications; the entire capacity of a PAM board is smaller than today's midrange FPGAs.*

#### 1997

**[Signal Processing at 250 MHz using High-Performance FPGAs](https://doi.acm.org/10.1145/258305.258313)**  
Brian Von Herzen  
Proceedings of the International Symposium on Field-Programmable Gate Arrays, pp. 62--68, 1997  
*Early and inspiring demonstration that FPGAs can operate productively at very high clock rates by paying careful attention to spatial locality and pipelining.*

**[Reconfigurable Computing: The Solution to Low Power Programmable DSP](https://dx.doi.org/10.1109/ICASSP.1997.599622)**  
Jan Rabaey  
Proceedings of the 1997 IEEE International Conference on Acoustics, Speech, and Signal Processing, Volume 1, pp. 275--278, April, 1997  
*Early paper making the case for the energy efficiency of reconfigurable architectures and including an early comparison of energy among processors, FPGAs, and ASICs.*

**[A Time-Multiplexed FPGA](https://dx.doi.org/10.1109/FPGA.1997.624601)**  
Steve Trimberger, Dean Carberry, Anders Johnson and Jennifer Wong  
Proceedings of the IEEE Symposium on FPGAs for Custom Computing Machines, pp. 22--28, April, 1997  
*How to add multicontext support to a mostly conventional FPGA architecture base.*

**[Defect Tolerance on the TERAMAC Custom Computer](https://dx.doi.org/10.1109/FPGA.1997.624611)**  
W. Bruce Culbertson, Rick Amerson, Richard Carter, Phil Kuekes, and Greg Snider  
Proceedings of the IEEE Symposium on FPGAs for Custom Computing Machines, pp. 116--123, April, 1997  
*Shows how the reconfigurability of the FPGA can be used to map around defects in the fabricated IC or board-level system. An early paper giving a full-system demonstration of the benefits of component-specific mapping.*

**[VPR: A New Packing, Placement, and Routing Tool for FPGA Research](https://doi.org/10.1007/3-540-63465-7_226)**  
Vaughn Betz and Jonathan Rose  
Proceedings of the International Conference on Field-Programmable Logic and Applications (published as LNCS-1304), pp. 213--222, Springer, 1997  
*A good placer coupled with a good version of PathFinder, targeted at island-style FPGAs. The free availability of this high-quality tool has provided a baseline standard for FPGA architectural work for over a decade.*

#### 1998

**[A New Retiming-based Technology Mapping Algorithm for LUT-based FPGAs](https://doi.acm.org/10.1145/275107.275118)**  
Peichen Pan and Chih-Chang Lin  
Proceedings of the International Symposium on Field-Programmable Gate Arrays, pp. 35--42, February, 1998  
*Optimally solves LUT mapping and retiming simultaneously. There are so few things we can solve optimally, and so few things we can afford to address together, that it is refreshing to see formulations that provide optimal results across multiple traditional levels of decomposition. As with FlowMap, later papers take this further and provide more efficient and general solutions, but the earlier papers introduce the cleanest problems and key ideas.*

**[How Much Logic Should Go in an FPGA Logic Block?](https://dx.doi.org/10.1109/54.655177)**  
Vaughn Betz and Jonathan Rose  
IEEE Design and Test of Computers, Volume 15, Number 1, pp. 10--15, 1998  
*A paper explaining the move to "Island-Style" FPGAs. Why do we use clusters with multiple LUTs?*

#### 1999

**[Architecture and CAD for Deep-Submicron FPGAs](https://doi.org/10.1007/978-1-4615-5145-4)**  
Vaughn Betz, Jonathan Rose, and Alexander Marquardt  
Kluwer Academic Publishers, 1999  
*Classic book on FPGA architecture and CAD. Describes VPR and island-style FPGAs. While the technology is dated, this book provides the best single introduction to FPGA organization and implementation issues, as well as a description of the popular clustering, placement, and routing algorithms used for physical mapping of designs to FPGAs.*

**[Balancing Interconnect and Computation in a Reconfigurable Computing Array (or, why you don't really want 100% LUT utilization)](https://doi.acm.org/10.1145/296399.296431)**  
André DeHon  
Proceedings of the International Symposium on Field-Programmable Gate Arrays, pp. 69--78, February, 1999  
*Since interconnect is the dominant area (and delay and energy) contributor on FPGAs, architectural optimizations that try to provide enough interconnect to use all the logic may be quite inefficient; this paper turns the question around and asks how the two should be balanced. It provides a clean, parameterized formulation of this tradeoff.*

**[FPGA Routing Architecture: Segmentation and Buffering to Optimize Speed and Density](https://doi.acm.org/10.1145/296399.296428)**  
Vaughn Betz and Jonathan Rose  
Proceedings of the International Symposium on Field-Programmable Gate Arrays, pp. 59--68, February, 1999  
*Why FPGA tracks are segmented, and details on the tradeoffs involved.*

**[HSRA: High-Speed, Hierarchical Synchronous Reconfigurable Array](https://doi.acm.org/10.1145/296399.296442)**  
William Tsu, Kip Macy, Atul Joshi, Randy Huang, Norman Walker, Tony Tung, Omid Rowhani, Varghese George, John Wawrzynek, and André DeHon  
Proceedings of the International Symposium on Field-Programmable Gate Arrays, pp. 125--134, February, 1999  
*Why should an FPGA run slower than a processor? Shows how adding pipelining to interconnect allows tools to target high-throughput operations.*

#### 2000

**[The Density Advantage of Configurable Computing](https://dx.doi.org/10.1109/2.839320)**  
André DeHon  
IEEE Computer, Volume 33, Number 4, pp. 41--49, 2000  
*Broad-audience article comparing FPGA, processor, and custom logic densities for accelerating computing applications.*

**[The Garp Architecture and C Compiler](https://dx.doi.org/10.1109/2.839323)**  
Timothy Callahan, John Hauser, and John Wawrzynek  
IEEE Computer, Volume 33, Number 4, pp. 62--69, 2000  
*Details one of the earliest architectures for using a reconfigurable array as a coprocessor attached to a microprocessor, including a compiler capable of automatically extracting application kernels for execution on the reconfigurable array.*

**[PipeRench: a reconfigurable architecture and compiler](https://dx.doi.org/10.1109/2.839324)**  
Seth C. Goldstein, Herman Schmit, Mihai Budiu, Srihari Cadambi, Matthew Moe, and R. Reed Taylor  
IEEE Computer, Volume 33, Number 4, pp. 70--77, 2000  
*Coarse-grained reconfigurable architecture with a virtual pipeline model that allows hardware scaling and fast application mapping.*

**Building a RISC System in an FPGA**  
Jan Gray  
Circuit Cellar Ink, Numbers 116--118, March, April, May, 2000  
*Tutorial on building custom processors optimized for implementation on FPGAs.*

#### 2001

**[Pilchard---A Reconfigurable Computing Platform with Memory Slot Interface](https://dx.doi.org/10.1109/FCCM.2001.36)**  
P. H. W. Leong, M. P. Leong, O. Y. H. Cheung, T. Tung, C. M. Kwok, M. Y. Wong, and K. H. Lee  
Proceedings of the IEEE Symposium on Field-Programmable Custom Computing Machines, pp. 170--179, April, 2001  
*A reconfigurable computing platform with a memory slot interface to improve transfer latency. A similar approach is used in DRC machines.*

#### 2002

**[Reconfigurable Computing: a Survey of Systems and Software](https://doi.acm.org/10.1145/508352.508353)**  
Katherine Compton and Scott Hauck  
ACM Computing Surveys, Volume 34, Number 2, pp. 171--210, 2002  
*An excellent survey paper on reconfigurable computing.*

#### 2004

**[FPGAs vs. CPUs: Trends in Peak Floating-Point Performance](https://doi.acm.org/10.1145/968280.968305)**  
Keith Underwood  
Proceedings of the International Symposium on Field-Programmable Gate Arrays, pp. 171--180, February, 2004  
*Article pointing out that FPGA floating-point performance was catching up with microprocessors and on track to surpass microprocessor floating-point performance for many tasks.*

**[Directional and Single-Driver Wires in FPGA Interconnect](https://dx.doi.org/10.1109/FPT.2004.1393249)**  
Guy Lemieux, Edmund Lee, Marvin Tom, and Anthony Yu  
Proceedings of the International Conference on Field-Programmable Technology, pp. 41--48, December, 2004  
*Why it no longer makes sense to have multi-driver, bidirectional wires.*

#### 2005

**[The Stratix II Logic and Routing Architecture](https://doi.acm.org/10.1145/1046192.1046195)**  
David Lewis, Elias Ahmed, Gregg Baeckler, Vaughn Betz, Mark Bourgeault, David Cashman, David Galloway, Mike Hutton, Chris Lane, Andy Lee, Paul Leventis, Sandy Marquardt, Cameron McClintock, Ketan Padalia, Bruce Pedersen, Giles Powell, Boris Ratchev, Srinivas Reddy, Jay Schleicher, Kevin Stevens, Richard Yuan, Richard Cliff, and Jonathan Rose  
Proceedings of the International Symposium on Field-Programmable Gate Arrays, pp. 14--20, February, 2005  
*A contemporary FPGA architecture.*

**[BEE2: A High-End Reconfigurable Computing System](https://dx.doi.org/10.1109/MDT.2005.30)**  
Chen Chang, John Wawrzynek, and Robert W. Brodersen  
IEEE Design and Test of Computers, Volume 22, Number 2, pp. 114--125, 2005  
*A contemporary reconfigurable computing platform.*

**[Dynamic voltage scaling for commercial FPGAs](https://dx.doi.org/10.1109/FPT.2005.1568543)**  
C. T. Chow, L. S. M. Tsui, Philip H. W. Leong, Wayne Luk, and Steve J. E. Wilton  
Proceedings of the International Conference on Field-Programmable Technology, pp. 173--180, December, 2005  
*Shows how to exploit dynamic voltage scaling on off-the-shelf FPGAs.*

**[Reconfigurable Computing: Architectures and Design Methods](https://dx.doi.org/10.1049/ip-cdt:20045086)**  
T.J. Todman, G.A. Constantinides, S.J.E. Wilton, O. Mencer, W. Luk, and P.Y.K. Cheung  
Computers and Digital Techniques, IEE Proceedings, Volume 152, Number 2, pp. 193--207, March, 2005  
*A recent survey paper on reconfigurable computing platforms and design with a wealth of references.*

#### 2006

**[Stream Computations Organized for Reconfigurable Execution](https://dx.doi.org/10.1016/j.micpro.2006.02.009)**  
André DeHon, Yury Markovsky, Eylon Caspi, Michael Chu, Randy Huang, Stylianos Perissakis, Laura Pozzi, Joseph Yeh, and John Wawrzynek  
Journal of Microprocessors and Microsystems, Volume 30, Number 6, pp. 334--354, September, 2006  
*Scalable compute model for reconfigurable systems based on stream-connected concurrent operators. Illustrates how we can design applications at a high level and efficiently and automatically map them to physical hardware platforms with a wide range of capacities.*

**[FPGA Design Automation: A Survey](https://dx.doi.org/10.1561/1000000003)**  
Deming Chen, Jason Cong, and Peichen Pan  
In Foundations and Trends in Electronic Design Automation, Volume 1, Number 3, pp. 195--330, November, 2006  
*Modern survey of FPGA CAD algorithms.*

#### 2007

**[Measuring the Gap Between FPGAs and ASICs](https://dx.doi.org/10.1109/TCAD.2006.884574)**  
Ian Kuon and Jonathan Rose  
IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems, Volume 26, Number 2, pp. 203--215, February, 2007  
*Modern effort to quantify the relative area, power, and delay of FPGAs compared to ASICs.*

**[RAMP: Research Accelerator for Multiple Processors](https://dx.doi.org/10.1109/MM.2007.39)**  
John Wawrzynek, David Patterson, Mark Oskin, Shih-Lien Lu, Christoforos Kozyrakis, James C. Hoe, Derek Chiou, and Krste Asanovic  
IEEE Micro, Volume 27, Number 2, pp. 46--57, 2007  
*An important, modern reconfigurable platform for emulation and simulation. With the growth in FPGA capacity, this effort can contemplate the emulation of systems containing hundreds to thousands of processor cores, where each FPGA is modeling several processors.*

**[FPGA Architecture: Survey and Challenges](https://dx.doi.org/10.1561/1000000005)**  
Ian Kuon, Russell Tessier, and Jonathan Rose  
Foundations and Trends in Electronic Design Automation, Volume 2, Number 2, pp. 135--253, 2007  
*Modern survey of FPGA Architecture.*

#### 2008

**[Reconfigurable Computing: The Theory and Practice of FPGA-Based Computation](https://doi.org/10.1016/B978-0-12-370522-8.X5001-8)**  
Scott Hauck and André DeHon  
Elsevier, 2008  
*Comprehensive book that covers all aspects of computing with FPGAs and FPGA-like components including device architecture, programming approaches, CAD flows, design issues, and sample applications.*

**[A Desktop Computer with a Reconfigurable Pentium](https://doi.acm.org/10.1145/1331897.1331901)**  
Shih-Lien L. Lu, Peter Yiannacouras, Taeweon Suh, Rolf Kassa, and Michael Konow  
ACM Transactions on Reconfigurable Technology and Systems, Volume 1, Number 1, March, 2008  
*Demonstration that a Pentium processor can be implemented on less than half of a modern FPGA.*

#### 2009

**[VPR 5.0: FPGA CAD and Architecture Exploration Tools with Single-Driver Routing, Heterogeneity and Process Scaling](https://doi.acm.org/10.1145/1508128.1508150)**  
Jason Luu, Ian Kuon, Peter Jamieson, Ted Campbell, Andy Ye, Wei Mark Fang, and Jonathan Rose  
Proceedings of the International Symposium on Field-Programmable Gate Arrays, pp. 133--142, February, 2009  
*Update of key open-source physical design tool including updated studies on LUT size, cluster size, and segmentation.*