There is an implicit dependency between the outgoing data stream and the
incoming response stream. The AXI specification requires that the
corresponding response is not sent before the last beat of data has been
received.
We can take advantage of this and remove the currently explicit dependency
between the data and response paths. This slightly simplifies the design.
Signed-off-by: Lars-Peter Clausen <lars@metafoo.de>
For the AXI streaming interfaces we need to make sure that the handshaking
rules for the external interface are met. Hence we can't just disable the
DMA and have to wait for any pending beats to complete.
For the FIFO interfaces on the other hand no such requirements exist. All
handshaking is for the internal pipeline which will be reset as a whole so
it is OK to violate the handshaking without causing any undefined behavior.
Signed-off-by: Lars-Peter Clausen <lars@metafoo.de>
For the memory-mapped AXI read interface the slave asserts rlast for the
last beat in a burst.
This means we don't have to count the number of beats to know when the
burst is completed but instead can use rlast. This slightly reduces the
amount of resources needed for the MM-AXI source module and given that the
beat_counter is often the bottleneck timing wise this should also improve
the timing.
Signed-off-by: Lars-Peter Clausen <lars@metafoo.de>
When the DMA is disabled it should gracefully shutdown and eventually end
up in an idle state. All outstanding AXI MM requests need to complete
before the DMA is fully disabled.
Add testbenches that test this for both AXI MM read and write behavior.
Signed-off-by: Lars-Peter Clausen <lars@metafoo.de>
The DMAC allows a transfer to be aborted. When a transfer is aborted the
DMAC shuts down as fast as possible while still completing any pending
transactions as required by the protocol specifications of the port. E.g.
for AXI-MM this means to complete all outstanding bursts.
Once the DMAC has entered an idle state a special synchronization signal is
send to all modules. This synchronization signal instructs them to flush
the pipeline and remove any stale data and metadata associated with the
aborted transfer. Once all data has been flushed the DMAC enters the
shutdown state and is ready for the next transfer.
In addition each module has a reset that resets the modules state and is
used at system startup to bring them into a consistent state.
Re-work the shutdown process to instead of flushing the pipeline re-use the
startup reset signal also for shutdown.
To manage the reset signal generation introduce the reset manager module.
It contains a state machine that will assert the reset signals in the
correct order and for the appropriate duration in case of a transfer
shutdown.
The reset signal is asserted in all domains until it has been asserted for
at least 4 clock cycles in the slowest domain. This ensures that the reset
signal is not de-asserted in the faster domains before the slower domains
have had a chance to process the reset signal.
In addition the reset signal is de-asserted in the opposite direction of
the data flow. This ensures that the data sink is ready to receive data
before the data source can start sending data. This simplifies the internal
handshaking.
This approach has multiple advantages.
* Issuing a reset and removing all state takes less time than
explicitly flushing one sample per clock cycle at a time.
* It simplifies the logic in the faster clock domains at the expense of
more complicated logic in the slower control clock domain. This allows
for higher fMax on the data paths.
* Less signals to synchronize from the control domain to the data domains
The implementation of the pause mode has also slightly changed. Pause is
now a simple disable of the data domains. When the transfer is resumed
after a pause the data domains are re-enabled and continue at their
previous state.
Signed-off-by: Lars-Peter Clausen <lars@metafoo.de>
Move the transfer logic, including the 2d module, into its own sub-module.
This allows testing of the full transfer logic independently of the
register map logic.
The top-level module now only instantiates the register map and transfer
module, but does not have any logic on its own.
Signed-off-by: Lars-Peter Clausen <lars@metafoo.de>
The timing exceptions for the debug paths are currently a bit to broad and
can include paths that should not have an exception.
All the debug signals are coming from the i_request_arb instance, so
include that in the match to avoid false positives.
For most projects this wont have been a problem since there is usually a
fair amount of slack on the paths that were affected by this. But in
projects with high utilization this might result in undefined behavior.
Signed-off-by: Lars-Peter Clausen <lars@metafoo.de>
Fix the read side of the CDC data FIFO. The read address generation did not
function correctly.
Redesign the read side of the FIFO, and make sure that it becomes empty after
the DMA transfer ends; and never get stock in a cyclic mode.
The dac_last signal is not used anywhere in the module. Remove it and its
synchronization registers.
Fixes the following warnings:
[Synth 8-6014] Unused sequential element dac_dlast_reg was removed. ["axi_dacfifo_rd.v":372]
[Synth 8-6014] Unused sequential element dac_dlast_m1_reg was removed. ["axi_dacfifo_rd.v":373]
[Synth 8-6014] Unused sequential element dac_dlast_m2_reg was removed. ["axi_dacfifo_rd.v":374]
Signed-off-by: Lars-Peter Clausen <lars@metafoo.de>
Commit bfc8ec28c3 ("util_axis_fifo: instantiate block ram in async mode")
added the read-enable (reb) signal to the ad_mem block.
It didn't update the ad_mem instance in axi_dacfifo_address_buffer.v. This
results in the read-enable of the address_buffer being tied to 0.
Fix this by connecting the same signal that increments the read address.
Signed-off-by: Lars-Peter Clausen <lars@metafoo.de>
The DMAC implementation guarantees that the expression `dma_valid &
dma_xfer_req` is always identical to just dma_valid.
When generating the util_dacfifo dma_wren_s signal the optimizer doesn't know
of this though and hence will route both signals into the LUT that drives
the write enable for the BRAMs.
Simplify the expression by removing dma_xfer_req from it. Considering this
can be a fairly high fan-out net and is typically the bottleneck for the
util_dacfifo timing this helps to improve the timing.
Signed-off-by: Lars-Peter Clausen <lars@metafoo.de>
Some parts of the util_cdc library rely on dead logic elimination to remove
unused logic. Unfortunately with newer Vivado versions this results in
warnings about unused sequential elements being removed. Like:
WARNING: [Synth 8-6014] Unused sequential element cdc_sync_stage1_reg was removed.
To avoid this encase the logic in generate blocks that makes sure they are
not generated when not needed.
Signed-off-by: Lars-Peter Clausen <lars@metafoo.de>
This reverts commit 4b1d9fc86b "axi_dmac: Modified in order to avoid
vivado crash".
Vivado no longer crashes and this structure is much more efficient when it
comes to resource usage and timing. The intention here is to create a 1-bit
memory that is N entries deep and not a N bit signal.
Signed-off-by: Lars-Peter Clausen <lars@metafoo.de>
The burst_count signal and its derived signals are not used until the
burst_count has been explicitly initialized by loading a transfer. There is
no need to have a reset.
This reduces the fan-out of the reset signal.
Signed-off-by: Lars-Peter Clausen <lars@metafoo.de>
The data path register of the 2d_transfer module are qualified by the
corresponding valid signal. Their content is not used until they have been
explicitly initialized. There is no need to reset them explicitly.
This reduces the fan-out of the reset signal.
Signed-off-by: Lars-Peter Clausen <lars@metafoo.de>
There is no need to reset the data path in the address generator. The
values of the register on the data path are not used until they have been
explicitly initialized. Removing the reset simplifies the structure and
reduces the fan-out of the reset signal.
Signed-off-by: Lars-Peter Clausen <lars@metafoo.de>
Xilinx tools don't allow to use $clog2() when computing the value of a
localparam, even though it is valid Verilog.
For this reason a parameter was used for BYTES_PER_BURST_WIDTH so far. But
that generates warnings from both Quartus and Vivado since the parameter is
not part of the parameter list.
Fix this by changing it to a localparam and computing the log2() manually.
The upper limit for the burst length is known to be 4k, so values larger
than that don't have to be supported.
Signed-off-by: Lars-Peter Clausen <lars@metafoo.de>
A larger store-and-forward memory provides better protection against worst
case memory interface latencies by being able to store more data before
over-/underflowing.
Based on empirical testing it was found that using a size of 4 bursts can
still result in underflows/overflows under certain conditions. These do not
happen when using a size of 8 bursts.
This change does not significantly increase resource consumption. Both on
Intel and Xilinx the block RAM has a minimum depth of 512 entries. With a
default burst length of 16 beats that allows for up to 32 bursts without
requiring additional block RAM.
Signed-off-by: Lars-Peter Clausen <lars@metafoo.de>
The label for the store-and-forward memory size configuration option at the
moment is just "FIFO Size" and while the store-and-forward memory uses a
FIFO that is just a implementation detail.
Change the label to "Store-and-Forward Memory Size". This is more
descriptive as it references the function not the implementation.
Signed-off-by: Lars-Peter Clausen <lars@metafoo.de>
For correct operation the store-and-forward memory size must be a
power-of-two in the range of 2 to 32.
This is simple enough so we can list all values and let the IP Integrator
and QSYS perform proper validation of the parameter.
Signed-off-by: Lars-Peter Clausen <lars@metafoo.de>
This comment hasn't been true in a long long time. It does not have any
relation to the code around it anymore.
So just remove it.
Signed-off-by: Lars-Peter Clausen <lars@metafoo.de>
* jesd204: Add RX error statistics
Added 32 bit error counter per lane, register 0x308 + lane*0x20
On the control part added register 0x244 for performing counter reset and counter mask
Bit 0 resets the counter when set to 1
Bit 8 masks the disparity errors, when set to 1
Bit 9 masks the not in table errors when set to 1
Bit 10 masks the unexpected k errors, when set to 1
Unexpected K errors are counted when a character other than k28 is detected. The counter doesn't add errors when in CGS phase
Incremented version number
Commit e6aacd2f56 ("axi_dmac: Better support debug IDs when ID_WIDTH !=
3") managed to get the order of the IDs in the debug register wrong.
Restore the original order.
Signed-off-by: Lars-Peter Clausen <lars@metafoo.de>
The cfg_links_disable register will mask the SYNC lines, disabled links
will always have a de-asserted SYNC (logic state HIGH).
The FSM will stay in CGS as long as there is one active link with an
asserted SYNC (logic state LOW).
Update the test bench to generate the SYNC signals in different clock
edges, so it can test all the possible scenarios.
A multi-link is a link where multiple converter devices are connected to a
single logic device (FPGA). All links involved in a multi-link are synchronous
and established at the same time. For a TX link this means that the FPGA receives
multiple SYNC signals, one for each link. The state machine of the TX link
peripheral must combine those SYNC signals into a single SYNC signal that is
asserted when either of the external SYNC signals is asserted.
Dynamic multi-link support must allow to select to which converter devices on
the multi-link the SYNC signal is propagated too. This is useful when depending
on the use case profile some converter devices are supposed to be disabled.
Add the cfg_links_disable[0x081] register for multi-link control and
propagate its value to the TX FSM.
A multi-link is a link where multiple converter devices are connected to a
single logic device (FPGA). All links involved in a multi-link are synchronous
and established at the same time. For a RX link this means that the SYNC signal
needs to be propagated from the FPGA to each converter.
Dynamic multi-link support must allow to select to which converter devices on
the multi-link the SYNC signal is propagated too. This is useful when depending
on the usecase profile some converter devices are supposed to be disabled.
Add the cfg_links_disable[0x081] register for multi-link control and
propagate its value to the RX FSM.