Define INPUT_PIPELINE parameter, which can be used to activate the
REGISTER_INPUTS parameter of the PHY. This parameter will add an
additional register stage into the incoming parallel data stream.
It can be used to relax the timing margin between the PHY and Link modules.
This patch contains an initial effort to support the Stratix 10
architecture in our JESD204 framework.
Several instances were updated, doing simple context switching using the
DEVICE_FAMILY system parameter:
- xcvr_reset_control
- lane PLL (ATX PLL)
- link PLL (fPLL)
- native XCVR instance
Apart from the slightly different parameters of the instances above,
there were small differences at the reconfiguration Avalon_MM interface.
The link_pll_reset_control is required just for Arria10, so in case of
Stratix10 it isn't instantiated.
In Stratix 10 architecture there are several additional ports of the
xcvr_reset_control module that must be connected to the native XCVR
instance or tied to GND.
The following xcvr_reset_control ports were defined and connected to the
XCVR:
- rx|tx_analogreset_stat
- rx|tx_digitalreset_stat
- pll_select
If dac_valid is not a constant '1' it gets synchronized with the
dac_data_sync signal. This causes that dac_valid never asserts while
dac_data_sync is high, this way skipping the phase initialization.
ADRV9001 interfacing IP supports the following modes on Xilinx devices:
A B C D E F G H
CSSI__1-lane 1 32 80 80 2.5 SDR 8
CSSI__1-lane 1 32 160 80 5 DDR 4
CSSI__4-lane 4 8 80 80 10 SDR 2
CSSI__4-lane 4 8 160 80 20 DDR 1
LSSI__1-lane 1 32 983.04 491.52 30.72 DDR 4
LSSI__2-lane 2 16 983.04 491.52 61.44 DDR 2
Columns description:
A - SSI Modes
B - Data Lanes Per Channel
C - Serialization factor Per data lane
D - Max data lane rate(MHz)
E - Max Clock rate (MHz)
F - Max Sample Rate for I/Q (MHz)
G - Data Type
H - DDS Rate
CSSI - CMOS Source Synchronous Interface
LSSI - LVDS Source Synchronous Interface
Intel devices supports only CSSI modes.
De-assert dac_rst together with an updated control set.
This allows writing the control registers before releasing the reset.
This is important at start-up when stable set of controls is required.
De-assert adc_rst together with an updated control set.
This allows writing the control registers before releasing the reset.
This is important at start-up when stable set of controls is required.
Allow monitoring of non-PN patterns which have zeros in it.
e.g. nible-ramp, full range ramp.
Singular zeros got ignored if not out of sync, while OOS_THRESHOLD
consecutive zeros or non-matching data asserts the out of sync line.
Fix the *_ip.tcl scripts for axi_spi_engine and spi_engine_offload
module.
In case of a bool parameters the value_format and value properties must
be set for both user and hdl paramters. If not, in the generated verilog
code the tool will use "true" or "false" strings, instead of 0 or 1.
The input data path has a delay section that compensates for the ADC path delay.
By using a Dynamic Shift Registers coding style we can improve/change the
resource utilization on m2k:
Before After Resources
LUT 10097 10048 48 (0.28%)
LUTRAM 516 540 -24 (-0.4%)
FF 15285 14803 482 (1.37%)
The number of delay taps in the LA data path can be controlled manually, from
the regmap or automatically, according to the axi_adc_decimate's rate.
Moreover, because the rate is configure by software, and the time of
initialization, is different for the ADC path and LA path. There is an
uncertainty of plus/minus one sample between the two. Because ADC and LA
paths share the same clock we can easily synchronize the two paths. We
can't use reset, because the rate generation mechanism is different
between the two. So the ADC path is used as master valid generator and we
can use it to drive the LA path.
The synchronization is done by setting the rate source bit. This
mechanism can only be used if the desired rate for both path is equal,
including oversampling fom ADC decimation.
Adds information on:
- Log 2 of interface data widths in bits
- Interface type (0 - Axi MemoryMap, 1 - AXI Stream, 2 - FIFO ) .
Lets the driver discover interface widths and interface type settings,
this will deprecate the corresponding device tree properties.
This is useful in case of parametrized projects where the width of
the datapath is changing. This change will allow the use of a generic
device tree node.
Updated version to 4.3.a
Optimize the oversampling mechanism.
The behavior of the axi_dac_interpolate was changing if a debug module was
added to the core.
The current code has a better utilization and reliability.
When using an oversampling of 2 for axi_dac_interpolate the rate was
the same as with oversampling by 1(bypassing).
This commit removes the bypass for the ratio of 2.
For projects where the clock ratio between the sampling clock and core clock
is higher than 2, the ad_dds generates a number of samples equal with
the clock ratio. There is a phase offset between the samples, proportional
with the requested DDS frequency.
In scenarios where the DDS out frequency is closer to the upper
limit(Nyquist) and/or the clock ratio is also greater than 2 and the
dac_data_sync reminds low for an extended period of time, the DAC will
receive at each core clock period, a number of samples equal with the
clock ratio and with an amplitude influenced by the DDS out frequency.
In most cases similar with a sawtooth signal.
With this commit we ensures that samples received by the DAC are 0 for
the period where dac_data_sync signal is high. Only when the signal
transitions to low, the phase accumulator is initialized and the phase
information is passed to the phase to amplitude converter.
Another issue can appear when the sync signal is too short; less then
CLK_RATIO * clock cycles. Because the phase accumulator will not
synchronize at all stages, the final result will be a random combination of
sine-waves. Added a minimum sync pulse after the dac_data_sync is set
low.
When frame alignment error monitoring is enabled and error threshold is met
at least for one lane, generate an interrupt so software can reset the link and
do further bring-up steps.
Add support for RX frame alignment character checking when scrambling is enabled and
for link reset on misalignment.
Add support for xcelium simulator to jesd204/tb
The Pattern generator is part of the axi_logic_analyzer core.
The trigger signal can be internal (Oscilloscope or Logic Analyzer) or
external(TI or TO pins).
The sdo_enabled and sdi_enabled control lines are generated from the
current state of the CMD bus.
In case of a delayed SDI latching the sdi_enabled can be deasserted at
the moment of the last valid bit, losing the generation of the sdi_data_valid
signal, which eventually cause a data loss, or even deadlock on software driver.
To make the logic mode robust, latch the value of the CMD[9:8] at every
transfer command. Doing so the sdo_enabled and sdi_enabled control lines will
store the last active transfer command state and they will be
independent of the current state of the CMD bus. This way we can add
longer time delay to the SDI latching if it's necessary.
Having the same name for dac and adc TPLs creates conflict in the
address segment naming having random names associated to the segments.
This causes difficulties during scripting of the project in test bench
mode.
The value of the HDL parameter NUM_OF_SDI can be read out from the
register at address 0x0C. The same register contains the value of the
DATA_WIDTH.
The register has the following bit layout:
[15: 0] DATA_WIDTH
[23:16] NUM_OF_SDI
[31:24] 8'b0
Forward the offload's sync_id to the register map, by defining an
additional AXI stream interface between the offload and axi_spi_engine.
The last sync_id of the offload module can read out from the
register 0x00C4. It also can generate and interrupt if the irq mask is
configured accordingly.
There is a major compatibility issue between 2019.1 and 2019.2.
The file system_top.hdf got a different file extention. This will
cause a compilation failer in the end of the build. To save time
and fail earlier, upgrade the version mismatch message to ERROR.
If user still wants to build a branch with different tool version
the variable ADI_IGNORE_VERSION_CHECK should be set to 1.
The external synchronization signal should be synchronous with the
dac clock. Synchronization will be done on the rising edge of the signal.
The control bit is self clearing. Status bit shows that the synchronization
is armed but the synchronization signal has not yet been received
Added EXT_SYNC parameter to be able to keep the dac_sync original
behavior
The external synchronization signal should be synchronous with the
adc clock. Synchronization will be done on the rising edge of the signal.
The control bit is self clearing. Status bit shows that the synchronization
is armed but the synchronization signal has not yet been received. While
the synchronization mechanism is armed, the adc_rst output signal is set
The current format should allow for the SYSREF signal to be used as
synchronous capture start, but will need to be disabled before the
synchronization mechanism is armed
The commit 9ab88f1200 introduced a new
feature for the execution module, which provides the possibility to
delay the SDI line latch with one or more core clock cycle. Unfortunatly
the implementation was not correct and the SDI line was latched at the
wrong time.
This patch fix the aligment of the shift register and the SDI_DELAY parameter,
to latch the SDI line of the physical interface at the right time.
Improve the description of the feature.
In some cases, the Vivado version can contain other characters than just
numbers. One such example is after applying the patch of AR# 71948,
which makes `version -short` return something like `2018.3_AR71948`.
This patch changes the version check to ignore anything after the first
two components of the version.
Add definition for new ultrascale device packages.
The package information is used by software for xcvr calibration.
At the moment, the factors that are influencing the calibration for the new
packages are not clear.
The previous mechanism was "probing" the DMAs for valid data. Better said,
each interpolation channel enabled it's DMA until a valid data was received,
then it disabled the DMA read and waited for the adjacent channel(DMA) to
receive a valid data. Only when for both channels had valid data on the
DMAs interfaces was the transmission started. This added an undesired and
redundant complexity to the interpolation channels. Furthermore, for continuous
transmission, using the above mechanism lead to a fixed phase(sample)
shift between the two channels at each start.
By using the streaming mechanism the interface is simplified and the
above problems are solved.
For Intel projects:
In cases where the clock of source synchronous interface is not routed
through a clock capable pin the DPA receive mode can't be used. Instead
the clock will be routed through a clock buffer from an IO to the clock
tree and from there to the IOPLL.
implemented mux for temp reading either from internal or external
source; updated regmap; added param to identify source for temp
information; updated tacho measurements; added AVG_POW param used
for tacho measuremet average useful for simulations; defaults for
tacho measurements changed to params and added registers; added
prescaler for fsm control, FSM updated; changed register write
process; connected INTERNAL_SYSMONE to regmap, value can now be
read by software;
parameters with same names were duplicated with transceiver specific
names due different default values.
This does not scales very well.
Use same name for the parameters as for other parameters and do the
default value handling in the IP configuration layer.
In order to help timing closure on multi SLR FPGAs add a pipeline stage
between the link layer and physical layer. This will add a fixed amount
of delay to the overall latency.
Bus sizes often depend on parameters. In such cases the physical indexes
of the interfaces from the multi bus must be calculated based on parameters.
For each interface expose the formula that calculates the indexes to the
block design.
* jesd204b: add bonding clocks feature (fix for some routing issues)
* intel/adi_jesd204: bonding clock feature invisible in QSYS GUI if number of lanes is less than 6
* intel/adi_jesd204: clock network option renamed according to intel documentation
* intel/adi_jesd204: Hide BONDING_CLOCKS_EN parameter in RX mode
Co-authored-by: István Csomortáni <Csomi@users.noreply.github.com>
When using a non-maximum sampling rate the data is captured earlier by two
samples.
After the initial trigger jitter fix, a low latency/utilization was
desired(one sample delay for the trigger detection). After adding the
instrument trigger an equal latency between ADC and LA was required, hence the
need for a two sample delay on the trigger path. The delay was implemented
as two clock cycle delays not two sample delays.
This commit fixes this issue and offers a more robust design.
A trigger jitter was added by fix on the external trigger input. It
manifests at input sampling frequencies lower than the maximum frequency.
Added the required reset and CE(valid) signal to the last output
stages of the trigger to obtain the desired functionality for all
sampling rates.
The extra delay was added on the trigger and data paths to compensate
for the logic analyzer changes.
The extra delay will be also seen on the m2k daisy chain. The
delay between devices will be increased from 3 to 4 samples delay.
Fix external trigger for low sampling rates.
Because the external trigger can be a short pulse at high decimation rates
there is a high chance that the pulse will be missed.
xsim does not like if a register or wire is used before their
definition. Make sure the every register and wire is defined before it's
used the first time.
In Subclass 1 mode, we need to use a separate clock (device clock) to
drive the link and transport layer of the interface. Implement the
required infrastructure for this scenario.
The clock domain crossing will be done in by the TX|RX_FIFO in the PCS.
In Subclass 1 mode an external device clock (core clock) is used,
instead of the PCS output clock, to drive the link and transport layer.
Define an additional parameter, which can be used to enable clock input
port for the PHY module, which can be used as rx|tx_coreclkin source.
This commit reverts part of the changes done in the following commit:
- ff50963c7f -
"axi_ad9361- altera/xilinx reconcile- may be broken- do not use"
The above mentioned commit introduced latency variations on the Rx path
at different sample rates, or within the same sample rate after sample
rate changes. The variation is caused by multiple positions of the frame
detection combined with a free running toggle (rx_valid) that is not synchronized
with the actual samples.
Having a single frame detection position eliminates the latency
variation.
When having multiple 936x in parallel, this change enables the use of source
synchronous received clock from the master as sampling clock for other slaves.
This will eliminate skew between the interfaces since the data delays
are going to be tuned against the master clock after a multi-chip
synchronization (MCS) is done. This eliminates the clock crossing from
the slave to master domain inside the FPGA.
Sync the two valid signals to keep a fixed phase relationship between
the Rx ant Tx channels, this way avoiding +/- 1 sample differences
on the Tx-Rx latency between consecutive transfers.
The pulse period had a fixed value. Therefore, in order to be able
to configure it from the software, a 32b register pulse_period_reg
was added in axi_spi_engine. Also, to generate the pulse, the
output register pulse_gen_loadc was added.
- Add parameter for input data delay time to easily match the one of the
adc_trigger.
- Change the trigger delay path to match between the internal and
external(adc_trigger delays).
The ready signal of the SYNC interface should be always 1'b1,
regardless of ASYNC_SPI_VALUE.
Drive the ready with one in both branches of the ASYNC_SPI_CLK
generate block.
Currently trigger out pin is hold for 1ms in the next translation(t+1)
state(0 or 1). But not in the state that follows (t+2). This commit
fixes this issue and simplifies the logic.
The previous channel sync mechanism was simply holding the transmission by
pulling down the dma_rd_en of the two DMAs for each channel(set reg 0x50). After a
period of time (that will take the two DMAs to have the data ready to move)
the dma_rd_en was set for both channels, resulting in a synchronized start.
This mechanism is valid when the two channels are streaming the same
type of data (constant, waveform, buffer or math) at close frequencies.
Streaming 10MHz on a channel and 100KHz on the second one will result
in different interpolation factors being used for the two channels.
The interpolation counter runs only when the dma_transfer_suspended(reg 0x50)
is cleared. Because of this, different delays are added by the interpolation
counter one DMA with continuous dma_rd_en will have data earlier than the
one with dma_rd_en controlled by the interpolation counter. Furthermore,
because the interpolation counter value is not reset at each
dma_transfer_suspended, the phase shift between the 2 channels will
differ at each start of transmission.
To make the transfer start synced immune to the above irregularities a
sync_transfer_start register was added (bit 1 of the 0x50 reg).
When this bit is set and the bit 0(dma_transfer_suspended) is toggled,
the interpolation counters are reset. Each channel enables it's DMA until
valid data is received, then it waits for the adjacent channel to get valid data.
This mechanism will be simplified in a future update by using a streaming
interface between the axi_dac_interpolate and the DMAs that does not require
the probing of the DMA.
The decimation module controls the valid signal. The whole triggering mechanism
is active only when the valid signal is active.
In the case of low sampling rates, the valid signal is active once every
n clock cycles. If an external trigger condition is fulfilled and the data valid
signal is low at the time, that trigger will be ignored by the DMA.
To solve this issue, the trigger is held high until the valid is asserted.
And it stays high for at least one clock cycle.
The trigger signal that goes to the DMA(fifo_wr_sync) does not pass through
the variable fifo, for this reason, a 3 clock cycles delay is required, to
keep in sync the data with the trigger.
On the other hand, to be able to cascade the axi_logic_analyzer with
axi_adc_trigger, there should be small delays on the trigger path, for this
reason the trigger_out_adc was created.
Remove the extra delays on the trigger_i(external trigger pins).
- Add embedded trigger as an option. The use of the embedded trigger as an
option in the data stream is done for further processing, keeping the data
synchronized with the trigger.
When instrument (module) trigger is desired (logic_analyzer - adc_trigger),
a small propagation time is required, hence the need to remove the
util_extract(trigger extract) module from the data path.
- Add more options for the IO triggering. This will open the door for multiple
M2k synchronization(triggering).
trigger_o mux:
1 - trigger flag (from regmap)
2 - external pin trigger (Ti)
3 - external pin trigger (To)
4 - internal adc trigger
5 - logic analyzer trigger
The signal passed to trigger_o must not be delayed, but the new value has to be
kept for a short period, 1ms (100000 clock cycles), to reduce switch noises in
the system.
The axi_adc_trigger handles 3 output triggers:
- trigger_o - external trigger (1 clock cycle delay)
- trigger_out - signals on dmac/fifo_wr_sync the start of a new transfer.
A variable fifo depth is present in the data path, which delays the data
arriving at the DMA with 3 clock cycles. By coincidence, the external trigger
is synchronized and detected on 3 clock cycles. To get a maximum optimization
the trigger_out will be delayed with 3 clock cycles for internal triggers and
directly forwarded in the case of an external trigger.
- trigger_out_la (cascade trigger for logic_analyzer - m2k example)
Because the trigger_out_la must have a small delay, to get a realible
instrument triggering mechanism, a 1 delay clock cycle must be added on the
trigger paths, to avoid creating a closed combinatorial loop.
Increase pcore version. The major version 3 is used to describe the instrument
trigger updates.
This commit was created by squashing the following commits, these
messages were kept just for sake of history:
ad9694_500ebz: Mirror the SPI interface to FMCB
ad9694_500ebz: Set transceiver reference clock to 250
ad9694_500ebz: Allow to configure number of lanes, number of converters
and sample rate
axi_ad9694: Fix number of lanes, it must be 2
ad9694_500ebz: Update the mirrored spi pin assignments
ad9694_500ebz: Gate SPI MISO signals based on chip-select
ad9694_500ebz: Set channel pack sample width
ad9694_500ebz: Change reference clock location
ad9694_500ebz: Remove transceiver memory map arbitration
ad9694_500ebz: Ensure ADC FIFO DMA_DATA_WIDTH is not larger ADC_DATA_WIDTH
ad9694_500ebz: Adjust breakout board pin locations
ad_fmclidar1_ebz: Rename the ad9694_500ebz project
ad_fmclidar1_ebz: Fix lane mapping
ad_fmclidar1_ebz: Delete deprecated files
ad_fmclidar1_ebz: Integrate the axi_laser_driver into the design
ad_fmclidar1_ebz: OTW is an active low signal
ad_fmclidar1_ebz: zc706: Fix iic_dac signals assignment
ad_fmclidar1_ebz: Switch to util_adcfifo
ad_fmclidar1_ebz: Enable synced capture for the fifo
ad_fmclidar1_ebz/zc706: Enable CAPTURE_TILL_FULL
ad_fmclidar1_ebz/zc706: Reduce FIFO size to 2kB
ad_fmclidar1_ebz: Laser driver runs on ADC's core clock
ad_fmclidar1_ebz_bd: Delete the FIFO instance
Because the DMA transfers are going to be relatively small (< 2kbyte),
the DMA can handle the data rate, even when the frequency of the laser
driver pulse is set to its maximum value. (200 kHz)
The synchronization will be done by connecting the generated pulse to
the DMA's SYNC input. Although, to support 2 or 1 channel scenarios, we
need to use the util_axis_syncgen module to make sure that the DMA
catches the pulse, in cases when the pulse width is too narrow. (SYNC is
captures when valid and ready is asserted)
Also we have to reset the cpack IP before each pulse, to keep the DMA buffer's
relative starting point in time fixed, when only 2 or 1 channel is
active.
The module can receive a synchronous or asynchronous pulse with an arbitrary
width and generate a SYNC signal for the DMA Source AXI Streaming interface.
This way we can synchronize the DMA transfers to an external
pulse/signal.
The laser driver contains the axi_pulse_gen's IP and an additional
register map which controls/monitor the laser driver enable control line
and the over temperature warning line (OTW).
It also contains an interrupt logic, which allows to generate an
interrupt in function of the generated pulse or incoming OTW signal.
The IPs register maps looks as follow:
0x00 - axi_pulse_gen register map
0x80 - axi_laser_driver register map
0x80 - DRIVER_ENABLE
0x84 - DRIVER_OTW
0x88 - EXT_CLK_COUNTER
0xA0 - IRQ_MASK
0xA4 - IRQ_SOURCE
0xA8 - IRQ_PENDING
0xAC - SEQUENCER_CONTROL
0 - SEQUENCER_ENABLE
1 - AUTO_SEQUENCER_ENABLED
0xB0 - SEQUENCER_SYNC_OFFSET
0xB4 - AUTO_SEQUENCE
[ 1: 0] - CHANNEL_SEL_0
[ 5: 4] - CHANNEL_SEL_1
[ 9: 8] - CHANNEL_SEL_2
[13:12] - CHANNEL_SEL_3
0xB8 - MANUAL_SEQUENCE
[ 1: 0] - MANUAL_CHANNEL_SEL
Current interrupt sources scheme is:
- bit 0 : pulse (triggered by the level of the pulse)
- bit 1 : OTW_N enter (triggered by positive edge of the OTW_N)
- bit 2 : OTW_N exit (triggered by the level of the pulse)
Generate a reset signal before the pulse which can be used to reset
various IP's of the data path (eg. pack/cpack). This can help to clear out the
internal buffers and registers of these IP, starting clean at the moment when
the actual pulse arrives.
The sequencer has an auto and a manual mode, and can be set to custom
sequences of the TIA channel selection lines sate.
The sequencer in auto mode is synchronized to the pulse, it will change
its state before a generated pulse which will drive the lasers. The
offset between the sequencer beat and the laser driver pulse can be
modified through an AXI register.
- add missing false paths
- change the bus skew constraint to a false path, for some reason the
tool does not change the path's requirement after a set_bus_skew
constraint
Our internal repository was changed from phdl to ghdl. Update the
adi_env.tcl scripts and other scripts, which depends on the $ad_ghdl_dir
variable. This way the tools will see all the internal IPs too.
Add additional synchronization FIFOs to several interfaces of the
axi_spi_engine module, to prevent metastability and timing issues in
case when the system clock and the SPI clock are asynchronous.
There are devices where the SDO default state, between transactions, is
not GND, rather VCC.
Define a parameter, which can be used to set the default state of the
SDO line.
Move the subtraction outside of the always block. In this way we're not adding
an additional delay element on to the output of the differentiator,
which brakes the transfer function of the filter.
This patch will fix the following warning:
[Synth 8-689] width (16) of port connection 'up_axi_awaddr'
does not match port width (12) of module 'up_axi'
Vivado propagates and auto derives the clocks, however if multiple
instances of this components are used the names of the propagated clock
change while the constraint file has fixed name which will match only
the clocks from the first instance letting the second instance of the
clock div without exception.
Use missing MIMO_ENABLE parameter, which will insert
and additional de-skew logic to prevent timing issues coming from
the clock skew differences of two or multiple AD9361.
Define a MIMO_ENABLE parameter for the core, which will insert
and additional de-skew logic to prevent timing issues coming from
the clock skew differences of two or multiple AD9361.
Let the measured transfer length to be cleared at the end of each
transfer, other case in cyclic mode the counter will overflow and will
not present any useful information.
Once xfer_request is set the DMA must accept samples in the same clock
cycle if the fifo_wr_en signal is asserted.
If the req_valid asserts faster than the ID gets synchronized over the
the xfer request asserts without being ready to accept data.
This can lead to overflow assertion when using a FIFO like interface.
This patch addresses the following issue:
In case of transfers with multiple segments, if TLAST asserts on the last
beat of a non-last segment while more descriptors are queued up,
the completions for the queued segments may be missed causing timeout in
processes that wait for transfer completions.
This patch addresses the following issue:
In 2D mode when consecutive partial transfers occur, and the latter is
very short, will interfere with the completion mechanism of the first
transfer leading to uncompleted segments and unreported partial
transfers.
The tb_base.v verilog files does not contain a full module definition,
just some plain test code. In general the files is sourced inside the
test bench main module. As is, defining a timescale in these files will
generate an error, because timescale directive can not be inside a
module.
Delete all the timescale directive from these files.
When only one converter is used there is no need for concatenation and
slicer cores. In that case the TPL will connect to port 0 from the
application layer.
These parameters must be overwritten when the link is at 15Gbps.
The parameters have a GTY4_ prefix since the same parameters are shared
between GTY4 and GTH4 having different default values.
The interrupt controller from Microblaze based projects requires that
all its inputs have attributes which define the sensitivity of the
interrupt line. Other case it defaults to EDGE_RISING which is not the
case for DMAC, leading to incorrect interrupt reporting and handling in
case of such projects.
Out of Context constraints are needed for timing driven synthesis as for
avoiding critical warnings due clock queries.
The memory from the FIFO is inferred in different ways for high clock
speeds. Assume the highest frequency for all projects.
Fix library makefiles dep list using generic vendor info reg
Combine adi_int_bd_tcl with adi_auto_fill_bd_tcl procedure.
This change will simplify the process of generating makefiles for each library.
Removing the bd.tcl script from the adi_ip_files list will remove it from the
make dependency list.
Having a bd.tcl script in every IP is redundant.
adi_ip.tcl:
- add adi_init_bd_tcl - creates a blanch bd.tcl and a
parameters temporary_case_dependencies.mk when compiling an IP.
Its main purpose is to generate the bd.tcl, which will be included in
the IP's file-set.
- adi_auto_fill_bd_tcl will populate the empty bd.tcl based on the
top IP parameters and the presence of these parameters in
auto_set_param_list and auto_set_param_list_overwritable lists.
This task can not be performed by the first described procedure since
the file-set is not yet defined.
adi_xilinx_device_info_enc.tcl:
Split auto_set_param_list_overwritable from auto_set_param_list. As
the name states, some of the parameters are overwritable, this will help
when generating the bd.tcl script.
library.mk:
Include the temporary_case_dependencies.mk if it exists in the
IP root folder. The mentioned *.mk file contains non generic
dependencies for makefiles like targets to clean.
Common basic steps:
- Include/create infrastructure:
* Intel:
- require quartus::device package
- set_module_property VALIDATION_CALLBACK info_param_validate
* Xilinx
- add bd.tcl, containing init{} procedure. The init procedure will be
called when the IP will be instantiated into the block design.
- add to the xilinx_blockdiagram file group the bd.tcl and common_bd.tcl
- create GUI files
- add parameters in *_ip.tcl and *_hw.tcl (adi_add_auto_fpga_spec_params)
- add/propagate the info parameters through the IP verilog files
axi_clkgen
util_adxcvr
ad_ip_jesd204_tpl_adc
ad_ip_jesd204_tpl_dac
axi_ad5766
axi_ad6676
axi_ad9122
axi_ad9144
axi_ad9152
axi_ad9162
axi_ad9250
axi_ad9265
axi_ad9680
axi_ad9361
axi_ad9371
axi_adrv9009
axi_ad9739a
axi_ad9434
axi_ad9467
axi_ad9684
axi_ad9963
axi_ad9625
axi_ad9671
axi_hdmi_tx
axi_fmcadc5_sync
Xilinx:
When calling adi_auto_fpga_spec_params in the x_ip.tcl, parameters like
- FPGA_TECHNOLOGY
- FPGA_FAMILY
- SPEED_GRADE
- DEV_PACKAGE
- XCVR_TYPE
- FPGA_VOLTAGE
will be automatically detected and constrained to predefined pairs of values
from adi_xilinx_device_info_env.tcl
The parameters specified in the blobk diagram of the IP(bd.tcl), will be
automatically assign when the IP is added to a block design.
The "adi_auto_assign_device_spec $cellpath" is called in the init
hook (bd.tcl).
https://www.xilinx.com/products/technology/high-speed-serial.html
Intel:
Info parameters are set in the VALIDATION_CALLBACK according to
adi_intel_device_info_env.tcl
Fix the following warning:
WARNING: [Synth 8-2611] redeclaration of ANSI port up_es_reset is not allowed
Also make sure, that in all configurations, the register has a diver.
Add support for 8 bit resolution for the transport layer.
Fix parameter BITS_PER_SAMPLES propagation to all the internal modules, in
several cases this variable was hard coded to 16.
The axi_pulse_gen is a generic PWM generator, which can be configured
through an AXI Memory Mapped interface.
The current register map look like follows:
0x00 - VERSION
0x04 - ID
0x08 - SCRATCH
0x0C - IDENTIFICATION - 0x504c5347 which stands for 'PLSG' in ASCII
0x10 - CONFIGURATION - contains reset and load bits
0x14 - PULSE_PERIOD
0x18 - PULSE_WIDTH
Also update all the other modules, which instantiate the util_pulse_gen.
To prevent the case, when after an invalid configuration, the generated
output PWM signal is constant HIGH, change the counter to a
down-counter. In this way the pulse will be placed at the end of the
PWM period, and if the configured width value is higher than the
configured period the output signal will be constant LOW.
Write code to pipeline data path for better DSP utilization on the
color space conversion.
In the old method the addition operations were performed outside the
DSPs
The FIFO functions in 'first fall through' mode, adjust the fifo level
generation so it take into account the valid data which sits on the bus,
waiting for ready, too.
Current implementation does not supports updated versions of Vivado
e.g. 2017.4.1 or 2018.2.1
This fix ignores the update number from the version checking.
Registers from this component can fit in the 2k address range.
Since Vivado's minimal address range is 4k, use that instead.
This will allow placing the independent TPLs to base addresses
that mach the addresses from the monolithic blocks ensuring no software
intervention.
The DMAC has the requirement that the length of the transfer is aligned to
the widest interface width. E.g. if the widest interface is 256 bit or 32
bytes the length of the transfer needs to be a multiple of 32.
This restriction can be relaxed for the memory mapped interfaces. This is
done by partially ignoring data of a beat from/to the MM interface.
For write access the stb bits are used to mask out bytes that do not
contain valid data.
For read access a full beat is read but part of the data is discarded. This
works fine as long as the read access is side effect free. I.e. this method
should not be used to access data from memory mapped peripherals like a
FIFO.
This means that for example the length alignment requirement of a DMA
configured for a 64-bit memory and a 16-bit streaming interface is now only
2 bytes instead of 8 bytes as before.
Note that the address alignment requirement is not affected by this. The
address still needs to be aligned to the width of the MM interface that it
belongs to.
Signed-off-by: Lars-Peter Clausen <lars@metafoo.de>
FPGAs support different widths for the read and write port of the block
SRAM cells. The DMAC can make use of this feature when the source and
destination interface have a different width to up-size/down-size the data
bus.
Using memory cells with asymmetric port width consumes the same amount of
SRAM cells, but allows to bypass the re-size blocks inside the DMAC that
are otherwise used for up- and down-sizing. This reduces overall resource
usage and can improve timing.
If the ratio between the destination and source port is too larger to be
handled by SRAM alone the SRAM block will be configured to do partial up-
or down-sizing and a resize block will be inserted to take care of the
remaining up-/down-sizing. E.g. if a 256-bit interface is connected to a
32-bit interface the SRAM will be used to do an initial resizing of 256 bit
to 64 bit and a resize block will be used to do the remaining resizing from
64 bit to 32 bit.
Currently this feature is disabled for Intel FPGAs since Quartus does not
properly infer a block RAM with different read and write port widths from
the current ad_asym_mem module. Once that has been resolved support for
asymmetric memories can also be enabled in the DMAC.
Signed-off-by: Lars-Peter Clausen <lars@metafoo.de>
The handling of the src_data_valid_bytes signal and its related signal is
tightly coupled to the behavior of the resize_src module. The code that
handles it makes assumptions about the internal behavior of the resize_src
module.
Move the handling of the src_data_valid_bytes signal when upsizing the data
bus into the resize_src module so that all the code that is related is in
the same place and the code outside of the module does not have to care
about the internals.
Signed-off-by: Lars-Peter Clausen <lars@metafoo.de>
The DMA_LENGTH_ALIGN LSBs of all length For the most part the tools are
able to deduce this using constant propagation.
But this propagation does not work across the asynchronous meta data FIFO
in the burst memory module.
Add a DMA_LENGTH_ALIGN parameter to the burst_memory module which is used
to explicitly keep the LSBs of length registers on the destination side
fixed at 1'b1. This reduces resource use and improves timing by allowing
better constant propagation and unused logic elimination.
Signed-off-by: Lars-Peter Clausen <lars@metafoo.de>
This simplifies the burst length in the response manager significantly
while not costing much extra resources in the burst memory.
This change will also enable other future improvements.
Signed-off-by: Lars-Peter Clausen <lars@metafoo.de>
One of the major features of the DMAC is being able to handle non matching
interface widths for the destination and source side.
Currently the test benches only support the case where the width for the
source and the destination side are the same. Extend them so that it is
possible to also test and verify setups where the width is not the same.
To accomplish this each byte memory location is treated as if it contained
the lower 8 bytes of its address. And then the written/read data is
compared to the expected data based on that.
Signed-off-by: Lars-Peter Clausen <lars@metafoo.de>
On Arria10 there are 6 transceivers in a single bank. If more than 6
transceivers are used these will end up in multiple banks.
The ATX PLL can directly connect to the transceivers in the same bank
through the 1x clock network. To connect to transceivers in another bank it
has to go through a master clock generation block (MCGB) and the xN clock
network.
Add support for instantiating the MCGB if more than 6 lanes are used. In
this case the first 6 transceivers will still have a direct connection to
the PLL while all other transceivers will be clocked by the MCGB.
Note that this requires that the first 6 transceivers are all in the same
bank.
Signed-off-by: Lars-Peter Clausen <lars@metafoo.de>
All projects have been updated to use the new pack/unpack infrastructure.
The old util_cpack and util_upack cores are now unused an can be removed.
Signed-off-by: Lars-Peter Clausen <lars@metafoo.de>
The util_cpack2 core is similar to the util_upack core. It packs, or
interleaves, a data from multiple ports into a single data. Ports can
optionally be enabled or disabled.
On the input side the cpack2 core uses a multi-port FIFO interface. There
is a single data write signal (fifo_wr_en) for all ports. But each port can
be individually enabled or disabled using the enable signals.
On the output side the cpack2 core uses a single port FIFO interface. When
data is available on the output interface the data write signal
(packed_fifo_wr_en). Data on the packed_fifo_wr_data signal is only valid
when packed_fifo_wr_en is asserted. At other times the content is
undefined. The cpack2 core offers no back-pressure. If data is not consumed
when it is made available it will be lost.
Data from the input ports is accumulated inside the cpack2 core and if
enough data is available to produce a full output vector the data is
forwarded.
This core is build using the common pack infrastructure. The core that is
specific to the cpack2 core is mainly only responsible for generating the
control signals for the external interfaces.
The core is accompanied by a test bench that verifies correct behavior for
all possible combinations of enable masks.
Signed-off-by: Lars-Peter Clausen <lars@metafoo.de>
The util_upack2 core is similar to the util_upack core. It unpacks, or
deinterleaves, a data stream onto multiple ports.
The upack2 core uses a streaming AXI interface for its data source instead
of a FIFO interface like the upack core uses.
On the output side the upack2 core uses a multi-port FIFO interface. There
is a single data request signal (fifo_rd_en) for all ports. But each port
can be individually enabled or disabled using the enable signals.
This modified architecture allows the upack2 core to better generate the
valid and underflow control signals to indicate whether data is available
in a response to a data request.
If fifo_rd_en is asserted and data is available the fifo_rd_valid signal
are asserted in the following clock cycle. The enabled fifo_rd_data ports
will be contain valid data during the same clock cycle as fifo_rd_valid is
asserted. During other clock cycles the output data is undefined. On
disabled ports the data is always undefined.
If no data is available instead the fifo_rd_underflow signal is asserted in
the following clock cycle and the output of all fifo_rd_data ports is
undefined.
This core is build using the common pack infrastructure. The core that is
specific to the upack2 core is mainly only responsible for generating the
control signals for the external interfaces.
The core is accompanied by a test bench that verifies correct behavior for
all possible combinations of enable masks.
Signed-off-by: Lars-Peter Clausen <lars@metafoo.de>
Pack and unpack operations are very similar in structure as such it makes
sense for pack and unpack core to share a common infrastructure.
The infrastructure introduced in this patch is based on a routing network
which can implement the pack and unpack operations and grows with a
complexity of N * log(N) where N is the number of channels times the number
of samples per channel that are process in parallel.
The network is constructed from a set of similar stages composed of either
2x2 or 4x4 switches. Control signals for the switches are fully registered
and are generated one cycle in advance.
Signed-off-by: Lars-Peter Clausen <lars@metafoo.de>
Add support for Vivado's simulator. By default the run script is using
the Icarus simulator.
If the user want to switch to another simulator, it can be explicitly
specify the required simulator tool in the SIMULATOR variable.
Currently, beside Icarus, Modelsim (SIMULATOR="modelsim") and Vivado's
xsim (SIMULATOR="xsim") is supported.
For consistent simulation behavior it is recommended to annotate all source
files with a timescale. Add it to those where it is currently missing.
Signed-off-by: Lars-Peter Clausen <lars@metafoo.de>
By default inferred output reset signals have an active low polarity. The
axi_ad9361 rst output signal is active high though. Currently when
connecting it to a input reset with active high polarity will generate an
error in IPI.
Fix this by explicitly marking the polarity of the rst signal as active
high.
Signed-off-by: Lars-Peter Clausen <lars@metafoo.de>
Replace the open-coded instances of a perfect shuffle in the DAC framer with
the new helper module.
Using the helper module gives well defined semantics and hopefully makes
the code easier to understand.
There are no changes in behavior.
Signed-off-by: Lars-Peter Clausen <lars@metafoo.de>
The perfect shuffle is a common operation in data processing. Add a shared
module that implements this operation.
Having this in a shared module rather than open-coding every instance makes
sure that there are clear and well defined semantics associated with the
operation that are the same each time. This should ease review, maintenance and
understanding of the code.
The perfect shuffle splits the input vector into NUM_GROUPS groups and then
each group in WORDS_PER_GROUP. The output vector consists of
WORDS_PER_GROUP groups and each group has NUM_GROUPS words. The data is
remapped, so that the i-th word of the j-th word in the output vector is
the j-th word of the i-th group of the input vector.
The inverse operation of the perfect shuffle is the perfect shuffle with
both parameters swapped.
I.e. [perfect_suffle B A [perfect_shuffle A B data]] == data
Examples:
NUM_GROUPS = 2, WORDS_PER_GROUP = 4
[A B C D a b c d] => [A a B b C c D d]
NUM_GROUPS = 4, WORDS_PER_GROUP = 2
[A a B b C c D d] => [A B C D a b c d]
NUM_GROUPS = 3, WORDS_PER_GROUP = 2
[A B a b 1 2] => [A a 1 B b 2]
Signed-off-by: Lars-Peter Clausen <lars@metafoo.de>
The write logic (DMA side) has to be independent from the read logic (DAC side).
In general the FIFO is always ready for the DMA, and every DMA transaction will
interrupt the read-back process, and the module will stop sending data,
until the initialization is finished.
Bringing back the write address tot he DMA clock domain is totally
redundant, so delete it.