This reverts commit 4b1d9fc86b "axi_dmac: Modified in order to avoid
vivado crash".
Vivado no longer crashes and this structure is much more efficient when it
comes to resource usage and timing. The intention here is to create a 1-bit
memory that is N entries deep and not a N bit signal.
Signed-off-by: Lars-Peter Clausen <lars@metafoo.de>
The burst_count signal and its derived signals are not used until the
burst_count has been explicitly initialized by loading a transfer. There is
no need to have a reset.
This reduces the fan-out of the reset signal.
Signed-off-by: Lars-Peter Clausen <lars@metafoo.de>
The data path register of the 2d_transfer module are qualified by the
corresponding valid signal. Their content is not used until they have been
explicitly initialized. There is no need to reset them explicitly.
This reduces the fan-out of the reset signal.
Signed-off-by: Lars-Peter Clausen <lars@metafoo.de>
There is no need to reset the data path in the address generator. The
values of the register on the data path are not used until they have been
explicitly initialized. Removing the reset simplifies the structure and
reduces the fan-out of the reset signal.
Signed-off-by: Lars-Peter Clausen <lars@metafoo.de>
Xilinx tools don't allow to use $clog2() when computing the value of a
localparam, even though it is valid Verilog.
For this reason a parameter was used for BYTES_PER_BURST_WIDTH so far. But
that generates warnings from both Quartus and Vivado since the parameter is
not part of the parameter list.
Fix this by changing it to a localparam and computing the log2() manually.
The upper limit for the burst length is known to be 4k, so values larger
than that don't have to be supported.
Signed-off-by: Lars-Peter Clausen <lars@metafoo.de>
A larger store-and-forward memory provides better protection against worst
case memory interface latencies by being able to store more data before
over-/underflowing.
Based on empirical testing it was found that using a size of 4 bursts can
still result in underflows/overflows under certain conditions. These do not
happen when using a size of 8 bursts.
This change does not significantly increase resource consumption. Both on
Intel and Xilinx the block RAM has a minimum depth of 512 entries. With a
default burst length of 16 beats that allows for up to 32 bursts without
requiring additional block RAM.
Signed-off-by: Lars-Peter Clausen <lars@metafoo.de>
The label for the store-and-forward memory size configuration option at the
moment is just "FIFO Size" and while the store-and-forward memory uses a
FIFO that is just a implementation detail.
Change the label to "Store-and-Forward Memory Size". This is more
descriptive as it references the function not the implementation.
Signed-off-by: Lars-Peter Clausen <lars@metafoo.de>
For correct operation the store-and-forward memory size must be a
power-of-two in the range of 2 to 32.
This is simple enough so we can list all values and let the IP Integrator
and QSYS perform proper validation of the parameter.
Signed-off-by: Lars-Peter Clausen <lars@metafoo.de>
This comment hasn't been true in a long long time. It does not have any
relation to the code around it anymore.
So just remove it.
Signed-off-by: Lars-Peter Clausen <lars@metafoo.de>
Commit e6aacd2f56 ("axi_dmac: Better support debug IDs when ID_WIDTH !=
3") managed to get the order of the IDs in the debug register wrong.
Restore the original order.
Signed-off-by: Lars-Peter Clausen <lars@metafoo.de>
Split the register map code into a separate sub-module instead of having it
as part of the top-level axi_dmac.v file.
This makes it easier to component test the register map behavior
independently from the DMA transfer logic.
Signed-off-by: Lars-Peter Clausen <lars@metafoo.de>
When the source and destination bus widths don't match a resize block is
inserted on the side of the narrower bus. This resize block can contain
partial data.
To ensure that there is no residual partial data is left in the resize
block after a transfer shutdown the resize block is reset when the DMA is
disabled.
Currently this is implemented by tying the reset signal of the resize block
to the enable signal of the DMA. This enable signal is only a indicator
though that the DMA should shutdown. For a proper shutdown outstanding
transactions still need to be completed.
The data that is in the resize block might be required to complete those
transactions. So performing the reset when the enable signal goes low can
lead to a situation where the DMA tries to complete a transaction but can't
do it because the data required to do so has been erased by resetting the
resize block. This leads to a dead lock and the system has to be rebooted
to recover from it.
To solve this use the sync_id signal to reset the resize block. The sync_id
signal will only be asserted when both the destination and source side
module have indicated that they are ready to be reset and there are no more
pending transactions.
Signed-off-by: Lars-Peter Clausen <lars@metafoo.de>
The MAX_BYTES_PER_BURST option allows to configure the maximum bytes that
are part of a burst. This can be an arbitrary value.
At the same time there is a limit of how many bytes can be supported by the
memory buses. A AXI3 interface supports a maximum of 16 beats per burst
and a AXI4 interface supports a maximum of 256 beats per burst.
At the moment the it is possible to specify a MAX_BYTES_PER_BURST value
that exceeds what can be supported by the AXI memory-mapped bus. If that is
the case undefined behavior will occur and the DMAC will function
incorrectly.
To avoid this make sure that the MAX_BYTES_PER_BURST value does not exceed
the maximum that can be supported by the interfaces.
Signed-off-by: Lars-Peter Clausen <lars@metafoo.de>
The width of the AXI burst length field depends on the AXI standard
version. For AXI3 the width is 4 bits allowing a maximum burst length of 16
beats, for AXI4 it is 8 bits wide allowing a maximum burst length of 256
beats.
At the moment the width of the length signals are determined by type of the
source AXI interface, even if the source interface type is not AXI. This
means if the source interface is set to AXI3 and the destination interface
is set to AXI4 the internal width of the signal for all interfaces will be
4 bits. This leads to a truncation of the destination bus length field,
which is supposed to be 8 bits.
If burst are generated that are longer than 16 beats the upper bits of the
length signal will be truncated. The result of this will be that the
external AXI slave interface (e.g. the DDR memory) and the internal logic
in the DMA disagree about burst length. The DMA will eventually lock up
when its internal buffers are full.
To avoid this issue have different configuration parameters for the source
and destination interface that configure the AXI bus length field width.
This way one of the interfaces can be configured for AXI3 and the other for
AXI4 without interfering with each other.
Fixes: commit 495d2f3056 ("axi_dmac: Propagate awlen/arlen width through the core")
Signed-off-by: Lars-Peter Clausen <lars@metafoo.de>
When the DMAC is used in async clock domains the data FIFO instantiate
an ad_mem component to handle properly the clock crossing.
For Intel, this mode is used only in FMCJESDADC1 designs but without this
an error could appear in other projects too if the user reconfigures the core.
The set_false_path constraint targeted to the *ram* cells of the dmac
matched several intra clock domain paths where the timing analysis got
ignored resulting in intermitent data integrity issues.
Exposed AXI3 interface on the Intel version of the IP for UI and feature consistency.
Some of the signals that are defined as optional in the AMBA standard
are marked as mandatory in Qsys in case of AXI3. Because of this such signals
were added to the interface of the DMAC and driven with default values.
For Xilinx in order to keep existing behavior the newly added signals
are hidden from the interface.
New parameters are added to define the width of the AXI transaction IDs;
these are hidden from the UI; We can add them to the UI if the fixed size
of the IDs will cause port incompatibility issues.
The primary use-case of the DMA controller is in non-2D mode. Make this the
default, since allows projects to instantiate the controller with the
default configuration without having to explicitly disable 2D support.
Signed-off-by: Lars-Peter Clausen <lars@metafoo.de>
Currently the individual IP core dependencies are tracked inside the
library Makefile for Xilinx IPs and the project Makefiles only reference
the IP cores.
For Altera on the other hand the individual dependencies are tracked inside
the project Makefile. This leads to a lot of duplicated lists and also
means that the project Makefiles need to be regenerated when one of the IP
cores changes their files.
Change the Altera projects to a similar scheme than the Xilinx projects.
The projects themselves only reference the library as a whole as their
dependency while the library Makefile references the individual source
dependencies.
Since on Altera there is no target that has to be generated create a dummy
target called ".timestamp_altera" who's only purpose is to have a timestamp
that is greater or equal to the timestamp of all of the IP core files. This
means the project Makefile can have a dependency on this file and make sure
that the project will be rebuild if any of the files in the library
changes.
This patch contains quite a bit of churn, but hopefully it reduces the
amount of churn in the future when modifying Altera IP cores.
Signed-off-by: Lars-Peter Clausen <lars@metafoo.de>
The include files are currently only implicitly added to the component file
list. Do it explicitly as this will make sure that they show up in the
generated Makefile dependency list.
Signed-off-by: Lars-Peter Clausen <lars@metafoo.de>
This reduces the amount of boilerplate code that is present in these
Makefiles by a lot.
It also makes it possible to update the Makefile rules in future without
having to re-generate all the Makefiles.
Signed-off-by: Lars-Peter Clausen <lars@metafoo.de>
Bundle the TLAST signal in with the other AXIS slave signals to enable
easier connection between AXIS devices that use TLAST
Signed-off-by: Matt Fornero <matt.fornero@mathworks.com>
Add some limit TLAST support for the streaming AXI source interface. An
asserted TLAST signal marks the end of a packet and the following data beat
is the first beat for the next packet.
Currently the DMAC does not support for completing a transfer before all
requested bytes have been transferred. So the way this limited TLAST
support is implemented is by filling the remainder of the buffer with 0x00.
While the DMAC is busy filling the buffer with zeros back-pressure is
asserted on the external streaming AXI interface by keeping TREADY
de-asserted.
The end of a buffer is marked by a transfer that has the last bit set in
the FLAGS control register.
In the future we might add support for transfer completion before all
requested bytes have been transferred.
Signed-off-by: Lars-Peter Clausen <lars@metafoo.de>
The commit 6900c have added an additional register stage into the fifo read
data path, but the control signals (ready/valid/underflow) were not realigned
to the data. This can cause data lose or duplicated samples in some case.
Realign the control signals to the data.
The first attempt (f3daf0) faild miserably. When the data_req signal
from the device had more than 1 cycle of deassert state, because of the
added latency of the data stream, the device got 'zeros' too.
In this fix, the DMA will hold the valid data on the bus, between two
consecutive data request. The bus is reseted just after all the data
were sent out.
Reset the fifo_rd_data if the DMA does not have an active transfer.
Becasue all the DAC device cores are transfering the data from the FIFO
interface to the data interface without any validation signal, DMA needs to put
the data bus into a known state, to prevent the device core to send the
last known data again and again.
The current layout of the debug ID register assumes that the ID_WIDTH is 3.
Change things so that the padding 0 width depends on the ID_WIDTH
parameter so that we end up with the same register layout regardless of the
value of ID_WIDTH.
Also split things into two registers, this allows for an ID_WIDTH up to 8
(which should hopefully be enough for all practical applications).
Signed-off-by: Lars-Peter Clausen <lars@metafoo.de>
The AXI specification that the minimum address space size is 4k, make sure
the axi_dmac adheres to this.
Internally the register space is still 2k. This means the upper and lower
2k of the axi4lite register space will map to the same internal registers.
Software must not rely on this and only access the lower 2k to enable
compatibility in case the internal space grows in the future.
Signed-off-by: Lars-Peter Clausen <lars@metafoo.de>
Terminate the m_axi_list signal of the data mover instance in the
src_axi_stream module. This avoids a warning about the port being
unconnected.
Signed-off-by: Lars-Peter Clausen <lars@metafoo.de>
Currently the axi_dmac_hw.tcl script does not create interfaces if they
are not used in the current configuration. This has the disadvantage that
the ports belonging to these interfaces are not included in the generated
HDL wrapper. Which will generate a fair bunch of warnings when synthesizing
the HDL.
Instead always generate all interfaces, but disable those that are not used
in the current configuration. This will make sure that the ports belonging
to these interfaces are properly tied-off in the generate wrapper HDL.
This reduces the amount of false positive warnings generated and makes it
easier to spot actual issues.
Signed-off-by: Lars-Peter Clausen <lars@metafoo.de>
The DMAC currently doesn't support transfers where the length is not a
multiple of the bus width. When generating the wstrb signal we do pretend
though that we do and dynamically generate it based on the LSBs of the
transfer length.
Given that the other parts of the DMA don't support such transfers this is
unnecessary though. So remove it for now and replace it with a constant
expression where wstrb is always fully asserted.
The generated logic for the wstrb signal was quite terrible, so this
improves the timing of the core.
Signed-off-by: Lars-Peter Clausen <lars@metafoo.de>
Currently the read side of the src_response interface is not used. This
leads to warnings about signals that have a value assigned but are never
read.
To avoid this just comment out all signals that are related to the
src_response interface for now.
Signed-off-by: Lars-Peter Clausen <lars@metafoo.de>
Make sure that the right hand side expression of assignments is not wider
than the target signal. This avoids warnings about implicit truncations.
None of these changes affect the behaviour, just fixes some warnings about
implicit signal truncation.
Signed-off-by: Lars-Peter Clausen <lars@metafoo.de>
The external s_axi_{awaddr,araddr} signals that are connect to the core
have their width set according to the specified size of the register map.
If the s_axi_{awaddr,araddr} signal of the core is wider (as it currently
is for many cores) the MSBs of those signals are left unconnected, which
generates a warning.
To avoid this make sure that the signal width matches the declared register
map size.
Signed-off-by: Lars-Peter Clausen <lars@metafoo.de>
The axi_dmac can issue up to FIFO_SIZE read and write requests in parallel.
This is done in order to maximize throughput and compensate for for
latency.
Set the {read,write}IssuingCapability properties accordingly on the AXI
master interfaces. Otherwise qsys might decide to insert bridges that
artificially limit the number of requests, which in turn might affect
performance.
Signed-off-by: Lars-Peter Clausen <lars@metafoo.de>
Qsys allows to query to query the clock domain that is associated with a
clock input of a peripheral. This allows to automatically detect whether
the different clocks of the DMAC are asynchronous and CDC logic needs to be
inserted or not.
Auto-detection has the advantages that the configuration parameters don't
need to be set manually and the optional configuration will be choose
automatically. There is also less chance of error of leaving the settings
in a wrong configuration when e.g. the clock domains change.
In case the auto-detection should ever fail configuration options that
provide a manual overwrite are added as well.
Signed-off-by: Lars-Peter Clausen <lars@metafoo.de>
Group configuration parameters by function, provide human readable labels
as well as specify the allowed ranges for each parameter.
This prevents accidental misconfiguration and also makes it easier to
inspect (or change) the configuration in the Qsys GUI.
Signed-off-by: Lars-Peter Clausen <lars@metafoo.de>
Use the ad_ip_intf_s_axi helper function to create the axi4lite slave
interface for memory mapped peripherals. This slightly reduces the amount
of boilerplate code in the peripheral's *hw.tcl
Signed-off-by: Lars-Peter Clausen <lars@metafoo.de>
The axi_dmac can issue up to FIFO_SIZE read and write requests in parallel.
This is done in order to maximize throughput and compensate for for
latency.
Set the {read,write}IssuingCapability properties accordingly on the AXI
master interfaces. Otherwise qsys might decide to insert bridges that
artificially limit the number of requests, which in turn might affect
performance.
Signed-off-by: Lars-Peter Clausen <lars@metafoo.de>
Make sure the req_gen_valid and req_gen_ready signals are declared before
they are used. Strictly speaking the current code is correct and synthesis
correctly, but declaring the signals make the intentions of the code more
explicit.
Signed-off-by: Lars-Peter Clausen <lars@metafoo.de>
Move the CDC helper modules to a dedicated helper modules. This makes it
possible to reference them without having to use file paths that go outside
of the referencing project's directory.
Signed-off-by: Lars-Peter Clausen <lars@metafoo.de>
All the hdl (verilog and vhdl) source files were updated. If a file did not
have any license, it was added into it. Files, which were generated by
a tool (like Matlab) or were took over from other source (like opencores.org),
were unchanged.
New license looks as follows:
Copyright 2014 - 2017 (c) Analog Devices, Inc. All rights reserved.
Each core or library found in this collection may have its own licensing terms.
The user should keep this in in mind while exploring these cores.
Redistribution and use in source and binary forms,
with or without modification of this file, are permitted under the terms of either
(at the option of the user):
1. The GNU General Public License version 2 as published by the
Free Software Foundation, which can be found in the top level directory, or at:
https://www.gnu.org/licenses/old-licenses/gpl-2.0.en.html
OR
2. An ADI specific BSD license as noted in the top level directory, or on-line at:
https://github.com/analogdevicesinc/hdl/blob/dev/LICENSE
When a mapping has multiple address segments we need to consider all of
them to calculate the required address width.
Signed-off-by: Lars-Peter Clausen <lars@metafoo.de>
The address width needs to be large enough to be able to address the
largest possible address. This means the in addition to the address segment
range the specified offset also needs to be considered to calculate the
address width.
Signed-off-by: Lars-Peter Clausen <lars@metafoo.de>
up_rdata is qualified by the up_rack signal. There is no need to reset it
since by the time the signal is read the reset value has already been
overwritten anyway.
Also gate the up_rdata registers if no read operation is in progress. In
this case any changes would be ignored anyway.
Signed-off-by: Lars-Peter Clausen <lars@metafoo.de>
The AXI DMAC peripheral only uses 11-bit of the register map interface
address. Reducing the signal width to this value allows the scripts to
correctly infer the size of the register map.
Signed-off-by: Lars-Peter Clausen <lars@metafoo.de>
Currently the AXI address width of the DMA is always 32-bit. But not all
address spaces are so large that they require 32-bit to address all memory.
Extract the size of the address space that the DMA is connected too and
configure reduce the address size to the minimum required to address the
full address space.
This slightly reduces utilization.
If no mapped address space can be found the default of 32 bits is used for
the address.
Signed-off-by: Lars-Peter Clausen <lars@metafoo.de>
The debug registers are useful during development but are rarely used in a
production design. Add a option that allows to disable them, this reduces
the resource utilization of the DMAC.
Signed-off-by: Lars-Peter Clausen <lars@metafoo.de>
Depending on whether the core is configured for AXI4 or AXI3 mode the width
of the awlen/arlen signal is either 8 or 4 bit. At the moment this is only
considered in top-level module and all other modules use 8 bit internally.
This causes warnings about truncated signals in AXI3 mode, to resolve this
forward the width of the signal through the core.
Signed-off-by: Lars-Peter Clausen <lars@metafoo.de>
It seems that in the latest version a constant of "0" is no longer a valid
enablement dependency and "false" has be used instead.
Not setting the enablement dependency correctly results in the AXI port to
be assumed to be read-write rather than just read or write. This will
generate unnecessary logic for example in interconnects to which the DMA
controller is connected.
Signed-off-by: Lars-Peter Clausen <lars@metafoo.de>
Add a human readable name and descriptor for the AXI DMAC core.This string
will appear in various places e.g. like the IP catalog. This is a purely
cosmetic change.
Signed-off-by: Lars-Peter Clausen <lars@metafoo.de>
Add a register to the AXI DMAC register map which functions has a
identification register. The register contains the unique value of "DMAC"
(0x444d4143) and allows software to identify whether the peripheral mapped
at a certain address is an axi_dmac peripheral.
This is useful for detecting cases where the specified address contains an
error or is incorrect.
Signed-off-by: Lars-Peter Clausen <lars@metafoo.de>
This patch is a complementary fix of 8b8c37 patch. And fix
all the 'infer interface' issues.
The adi_ip_infer_interfaces process was renamed to
adi_ip_infer_streaming_interfaces. Now the process just do
what its name suggest.
Affected cores were axi_dmac, axi_spdif_rx, axi_spdif_tx, axi_i2s_adi
and axi_usb_fx3. All these cores scripts were updated.
Replace "PRIMITIVE_SUBGROUP == flop" with "IS_SEQUENTIAL" as the former is
series7 specific while the later works on all platforms. This fixes the
axi_dmac timing constraints for ultrascale based platforms.
Signed-off-by: Lars-Peter Clausen <lars@metafoo.de>
When all clocks are synchronous there are no synchronizers and the
constraint for the CDC registers can't find any cells which generates a
warning. To avoid this don't add CDC constraints when all the clocks are
synchronous.
Signed-off-by: Lars-Peter Clausen <lars@metafoo.de>
For the AXI stream interface we want to generate TLAST only at the end of
the transfer, rather than at the end of each burst.
Signed-off-by: Lars-Peter Clausen <lars@metafoo.de>
Conflicts:
library/axi_ad9361/axi_ad9361_ip.tcl
library/axi_dmac/Makefile
library/axi_dmac/axi_dmac_constr.ttcl
library/axi_dmac/axi_dmac_ip.tcl
library/common/ad_tdd_control.v
projects/daq2/common/daq2_bd.tcl
projects/fmcjesdadc1/common/fmcjesdadc1_bd.tcl
projects/fmcomms2/zc706pr/system_project.tcl
projects/fmcomms2/zc706pr/system_top.v
projects/usdrx1/common/usdrx1_bd.tcl
This merge was made, to recover any forgotten fixes from master,
before creating the new release branch. All conflicts were reviewed
and resolved.