Category Archives: Chip

How to use the Xilinx AXI IIC (I2C) controller with SCCB devices and no pull-up on SCL

Omnivision image sensors support a serial control interface called SCCB (see the Omnivision SCCB spec).

The main difference between SCCB and I2C is that SCCB specifies SCLK as an output-only, actively driven signal instead of an open-drain (tri-state/open-collector) one. If one actually wants to remove the pull-up on SCLK and use this feature, it causes a minor issue with some I2C IP blocks in FPGAs, as they assume an open-drain SCLK. Specifically, the Xilinx AXI IIC controller generates three signals (scl_i, scl_o, scl_t), which are normally used to control a pair of IBUF & OBUFT pads driving the scl_io pad. The way to use this IP in an SCCB setting is to use the scl_t signal to drive scl_io directly. The Xilinx OBUFT has an active-high tri-state input, so scl_t is high exactly when scl_io needs to be pulled up, and it is low only when scl_o is also low. In other words, scl_t carries the intended value of scl_io when it is actively driven. Also, scl_i should be connected to scl_t for the IP to sense the driven signal.
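The equivalence can be sketched as a small Python model (a hedged illustration, not Xilinx code; the function names are my own):

```python
# Minimal sketch of why scl_t can drive scl_io directly in an SCCB setup.
# With the usual open-drain wiring, an OBUFT (active-high tri-state input)
# drives the pad: when scl_t = 1 the buffer is disabled and the external
# pull-up wins; when scl_t = 0 the buffer drives scl_o, which the AXI IIC
# master only ever drives low.

def scl_io_open_drain(scl_o: int, scl_t: int) -> int:
    """Pad value with the usual OBUFT + external pull-up wiring."""
    return 1 if scl_t else scl_o  # tri-stated -> pulled high, else driven

def scl_io_active_drive(scl_t: int) -> int:
    """Pad value when scl_t drives scl_io directly (SCCB, no pull-up)."""
    return scl_t

# The two wirings agree for every state the master produces (scl_o == 0):
for t in (0, 1):
    assert scl_io_open_drain(scl_o=0, scl_t=t) == scl_io_active_drive(t)
```

This is just the truth-table argument from the paragraph above: since the IP never drives SCL high through scl_o, scl_t alone already encodes the intended line level.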

One final complication: if one makes the IIC interface external but uses only some of its signals in the Vivado block diagram, Vivado makes all the remaining signals external too, which creates non-existent pins. Therefore, all scl_x signals other than scl_t should be consumed inside the block diagram. Connecting scl_i to scl_t takes care of that signal, but scl_o must also be consumed. One can use it as an active-low source, or OR it into another signal (it is always zero, so it can’t hurt), so that it is not exported out of the top-level wrapper.

Incremental Place & Route with Xilinx Vivado toolset

I’ve been working on a design implemented on a Xilinx Series 7 FPGA. I had a -50 ps timing violation, and it seemed that the placement of the cells in the critical path could have been done better. In an ASIC flow this would be quite easy to fix, but previous generations of Xilinx tools didn’t support this functionality easily. The current Vivado toolset is significantly better: it turns out that after a design is placed & routed, it is possible to move cells with the unplace_cells and place_cells commands and then run route_design again, which repairs all the resulting DRC violations and produces another valid implementation. Kudos to Xilinx for delivering a good set of tools worthy of the Series 7 FPGAs.
BTW, the final timing result is -8 ps, which is good enough for this prototype, although fixing it would not have been difficult if needed.

Direct Digital Synthesis (DDS)

Apparently there is a nice Wikipedia entry, and Analog Devices also has a nice tutorial here:

The following has been suggested:

* Use a longer sine lookup table instead of interpolation.
* Store only 1/4 cycle of the sine wave in the lookup table and use bit operations on the address and output to map the 1/4 cycle onto the full wave.
* Use a dual-port RAM for the lookup table to simultaneously generate sine & cosine (useful for digital radio applications).
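The quarter-cycle trick in the second bullet can be sketched in Python (the table size and phase width below are illustrative assumptions, not from the post):

```python
import math

N = 12                      # total phase bits (illustrative)
Q = 1 << (N - 2)            # quarter-wave table length
# Store one quarter cycle; the half-LSB offset keeps the symmetry exact.
quarter = [math.sin(math.pi / 2 * (i + 0.5) / Q) for i in range(Q)]

def dds_sin(phase: int) -> float:
    """Full-cycle sine from the quarter-wave table via bit operations."""
    phase &= (1 << N) - 1
    quadrant = phase >> (N - 2)         # top two bits select the quadrant
    index = phase & (Q - 1)             # remaining bits index the table
    if quadrant & 1:                    # 2nd & 4th quadrants: mirror address
        index = Q - 1 - index
    value = quarter[index]
    if quadrant & 2:                    # 3rd & 4th quadrants: negate output
        value = -value
    return value

# Check against a reference full-wave sine at the same phase offsets.
for p in range(1 << N):
    ref = math.sin(2 * math.pi * (p + 0.5) / (1 << N))
    assert abs(dds_sin(p) - ref) < 1e-12
```

The address complement and sign flip are exactly the "bit operations on the address and output" the bullet describes; in hardware they cost one XOR per address bit plus a negation.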

Inverse sinc filters are a common way to equalize the spectral droop caused by the zero-order-hold nature of the DAC. Typically a simple FIR filter with a few taps (<10) can ‘lift’ the high-frequency response of the signal to compensate for this rolloff. This page describes one way to do it:
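A quick numeric sketch of the droop and a small lifting filter (the 3-tap values below are an illustrative choice of mine, not taken from the linked page):

```python
import math

# Illustrative symmetric 3-tap "inverse sinc" FIR; taps sum to 1.0,
# so the DC gain is unity while high frequencies are boosted.
taps = [-0.0625, 1.125, -0.0625]

def zoh_droop(f_norm: float) -> float:
    """Zero-order-hold (DAC) magnitude response; f_norm = f/fs."""
    if f_norm == 0.0:
        return 1.0
    x = math.pi * f_norm
    return math.sin(x) / x

def fir_gain(f_norm: float) -> float:
    """Magnitude response of the symmetric 3-tap lifting filter."""
    return taps[1] + 2 * taps[0] * math.cos(2 * math.pi * f_norm)

# At 0.4*fs the raw droop is about -2.4 dB; the FIR lifts most of it back.
f = 0.4
raw = zoh_droop(f)
corrected = zoh_droop(f) * fir_gain(f)
assert corrected > raw        # flatter (closer to 1.0) than uncorrected
```

Real designs would optimize the taps against 1/sinc over the band of interest; the point here is just that a tiny symmetric FIR can cancel most of the rolloff.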

Some more description in another post:

Interpolation in a DDS is usually handled differently than in, say, an interpolation filter. Normally it is done with a Taylor polynomial, which yields much better results than a linear interpolator. The usual problem with a Taylor polynomial is that it requires derivatives of the function. In a DDS, though, the derivatives of the sine and cosine functions are the very same sines and cosines (and their opposites). So with a BRAM-based lookup table with two read ports, you can read both sine and cosine at the same time, so you effectively also have, for free, the first (and second and third, etc.) derivatives of the outputs.
So then with little hardware you can make a first-order Taylor, and with a bit more you could even make a second-order, although this is rarely necessary. I’ll send you a Xilinx paper that explains this. It’s by Chris Dick and Fred Harris and is called “Direct Digital Synthesis – Some Options for FPGA Implementation”.
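The first-order Taylor idea can be sketched as follows (table and accumulator widths are illustrative assumptions):

```python
import math

# First-order Taylor interpolation in a DDS: a dual-read lookup table
# supplies both sin and cos at the coarse phase, and since cos is the
# derivative of sin, sin(p0 + dp) ~= sin(p0) + dp * cos(p0).

TABLE_BITS = 10
SIZE = 1 << TABLE_BITS
sin_tab = [math.sin(2 * math.pi * i / SIZE) for i in range(SIZE)]
cos_tab = [math.cos(2 * math.pi * i / SIZE) for i in range(SIZE)]

PHASE_BITS = 24                     # full phase-accumulator width

def taylor_sin(phase: int) -> float:
    """sin(2*pi*phase/2^PHASE_BITS) via table lookup + first-order Taylor."""
    phase &= (1 << PHASE_BITS) - 1
    coarse = phase >> (PHASE_BITS - TABLE_BITS)          # table index
    frac = phase & ((1 << (PHASE_BITS - TABLE_BITS)) - 1)
    dp = 2 * math.pi * frac / (1 << PHASE_BITS)          # residual angle
    return sin_tab[coarse] + dp * cos_tab[coarse]        # f(p0) + dp*f'(p0)

# Residual error is bounded by the dropped second-order term, dp^2/2.
for p in range(0, 1 << PHASE_BITS, 65537):
    ref = math.sin(2 * math.pi * p / (1 << PHASE_BITS))
    assert abs(taylor_sin(p) - ref) < 2e-5
```

In hardware this is one dual-port BRAM read plus one multiply-add; a second-order term would reuse the same table outputs with one more multiply.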

Quite a nice summary:

For high-precision sinusoids in FPGAs with multipliers, I’d try dusting off the technique from the vintage 1971 Tierney/Rader/Gold paper [Ref 1] and doing something like:

– Upper two{three} phase bits used for quadrant{octant} folding

– next N phase bits look up a ‘coarse’ IQ value ( coarse phase index, yet precise amplitude )

– next M phase bits look up a ‘fine’ IQ value ( residual rotation )

– complex multiply rotates coarse IQ by fine IQ

Figure six of their paper has a nice graphical summary of the technique.

The beauty of this scheme is that it is an exact computation, not an approximation; I haven’t worked out the error terms for 18×18 or 36×36 multipliers, but I’d expect you could easily do a computation to twenty-something bits of precision with two comfortably-fit-in-BRAM sized lookup tables and one complex multiply.
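The coarse/fine factorization above can be sketched in Python (table sizes are illustrative; quadrant folding is omitted for brevity):

```python
import cmath
import math

# Tierney/Rader/Gold-style coarse/fine table scheme, sketched in floating
# point. The phase word is [coarse | fine] bits; the full rotation
# e^{j*2*pi*phase/2^(N+M)} factors EXACTLY into a coarse rotation times a
# fine (residual) rotation, so the complex multiply is an exact
# computation up to table and multiplier word lengths.

N, M = 10, 10                               # coarse and fine phase bits
coarse_tab = [cmath.exp(2j * math.pi * k / (1 << N)) for k in range(1 << N)]
fine_tab = [cmath.exp(2j * math.pi * k / (1 << (N + M))) for k in range(1 << M)]

def dds_iq(phase: int) -> complex:
    """cos + j*sin for a 20-bit phase via two small tables, one multiply."""
    phase &= (1 << (N + M)) - 1
    coarse = phase >> M                      # coarse phase index
    fine = phase & ((1 << M) - 1)            # residual rotation index
    return coarse_tab[coarse] * fine_tab[fine]   # rotate coarse IQ by fine IQ

# With floating-point tables the factorization is exact:
for p in (0, 1, 12345, (1 << (N + M)) - 1):
    ref = cmath.exp(2j * math.pi * p / (1 << (N + M)))
    assert abs(dds_iq(p) - ref) < 1e-12
```

Note that the fine table covers only the small residual angles, which is why its cosine entries are all close to 1.0 — the shortcut the original TTL implementation exploited.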

Their actual implementation with 1970s-era TTL took some shortcuts to conserve hardware, e.g. approximating the fine cosine values as ~1.0. [Ref 2] is a great DDS reference that reprints that early paper, along with summaries of other sine computation methods [Ref 3, Ref 4].

[Ref 1] “A Digital Frequency Synthesizer”, Tierney/Rader/Gold, IEEE Transactions on Audio and Electroacoustics, March 1971

[Ref 2] “Direct Digital Frequency Synthesizers”, Kroupa (ed.), IEEE Press, 1996

[Ref 3] “The Optimization of Direct Digital Frequency Synthesizer Performance in the Presence of Finite Word Length Effects”, Nicholas/Samueli/Kim, Proceedings of the 42nd Annual Frequency Control Symposium, 1988

[Ref 4] “Methods of Mapping from Phase to Sine Amplitude in Direct Digital Synthesis”, Vankka, IEEE Transactions on Ultrasonics, Ferroelectrics, and Frequency Control, March 1997