Qsys allows to query to query the clock domain that is associated with a
clock input of a peripheral. This allows to automatically detect whether
the different clocks of the DMAC are asynchronous and CDC logic needs to be
inserted or not.
Auto-detection has the advantages that the configuration parameters don't
need to be set manually and the optional configuration will be choose
automatically. There is also less chance of error of leaving the settings
in a wrong configuration when e.g. the clock domains change.
In case the auto-detection should ever fail configuration options that
provide a manual overwrite are added as well.
Signed-off-by: Lars-Peter Clausen <lars@metafoo.de>
Group configuration parameters by function, provide human readable labels
as well as specify the allowed ranges for each parameter.
This prevents accidental misconfiguration and also makes it easier to
inspect (or change) the configuration in the Qsys GUI.
Signed-off-by: Lars-Peter Clausen <lars@metafoo.de>
In this particular case the behaviour is the same with non-blocking and
blocking assignments, but that could change if the code is modified in the
future. To avoid any potentially issue due to this consistently use
non-blocking assignments.
Signed-off-by: Lars-Peter Clausen <lars@metafoo.de>
Use the ad_ip_intf_s_axi helper function to create the axi4lite slave
interface for memory mapped peripherals. This slightly reduces the amount
of boilerplate code in the peripheral's *hw.tcl
Signed-off-by: Lars-Peter Clausen <lars@metafoo.de>
The address width of the AXI interface depends on the size of the register
and can differ from peripheral to peripheral. Add a parameter to the
function that allows to specify the address width, this allows to use the
function for more peripherals.
Keep the current value of 16 bits as the default if the parameter is not
specified.
Signed-off-by: Lars-Peter Clausen <lars@metafoo.de>
The axi_adxcvr register map only uses a single 4k page, make this explicit.
This will allow for tighter packaging in the limited available total
register map space.
Signed-off-by: Lars-Peter Clausen <lars@metafoo.de>
This partially reverts commit a8ade15173.
Remove the nonsensical Makefile dependencies that got added by accident.
Signed-off-by: Lars-Peter Clausen <lars@metafoo.de>
The MSB of the d_count signal is used as a overflow marker to stop the
counter from incrementing in the monitored clock domain. It is not exported
through the register map and truncated when assigned to the up_d_count
signal.
Make the truncation explicit to make it clear that this is not a mistake
and to avoid warnings about implicit truncation.
Signed-off-by: Lars-Peter Clausen <lars@metafoo.de>
The generic Altera clock monitor constraints expect the instance to be
called i_clock_mon. Adjust the code accordingly.
Signed-off-by: Lars-Peter Clausen <lars@metafoo.de>
In this particular case the behaviour is the same with non-blocking and
blocking assignments, but that could change if the code is modified in the
future. To avoid any potentially issue due to this consistently use
non-blocking assignments.
Signed-off-by: Lars-Peter Clausen <lars@metafoo.de>
The axi_dmac can issue up to FIFO_SIZE read and write requests in parallel.
This is done in order to maximize throughput and compensate for for
latency.
Set the {read,write}IssuingCapability properties accordingly on the AXI
master interfaces. Otherwise qsys might decide to insert bridges that
artificially limit the number of requests, which in turn might affect
performance.
Signed-off-by: Lars-Peter Clausen <lars@metafoo.de>
The SYNC signal that gets reported through the status interface should be
the output (second stage) of the synchronizer circuit.
Signed-off-by: Lars-Peter Clausen <lars@metafoo.de>
Make sure the core_cfg_transfer_en signal is declared before they are used.
Strictly speaking the current code is correct and synthesis correctly, but
declaring the signals make the intentions of the code more explicit.
Signed-off-by: Lars-Peter Clausen <lars@metafoo.de>
Be more standard compliant and assign names to generate for-blocks. This is
required for Altera/Intel support.
Signed-off-by: Lars-Peter Clausen <lars@metafoo.de>
Make sure the req_gen_valid and req_gen_ready signals are declared before
they are used. Strictly speaking the current code is correct and synthesis
correctly, but declaring the signals make the intentions of the code more
explicit.
Signed-off-by: Lars-Peter Clausen <lars@metafoo.de>
In some cases, the 'core_ilas_config_data' registers will be infered as
FDRE, instead of FDSE. Therefor a max delay definition, which are using
the S pin as its endpoint, it can become invalid, nonexistent.
Generalize the path, using the register itself as endpoint.
Increase the width of wvalid_counter, should be equal with awlen width.
The wvalid_counter needs to count from zero to the required burst
length. The maximum burst length is 255, so the width of the counter
have to be 8 bits. axi_last_beats will get the last axi burst length.
The fifo will ask for a new data from the DDR, if the current
level is lower than the high threshold. This will prevent overflow.
By deleting the lower threshold, we can avoid ocassional underflows,
when the DAC rate is closer to the max DDRx rate.
All verilog file are using the Verilog-2001 standard to define
and/or declare ports. Definin a port width with a local parameter
is a bad practive, when this standard is used. Some simulators
will crash. Try to avoid it.
Fix the dma_ready mux in top module, and the dma_ready_out reset
logic in axi_dacfifo_wr module. Also, both write and read addresses
of the async CDC fifo (inside the axi_dacfifo_wr) should be reset
before a dma transaction starts.
If the streaming bit is set, after the trigger condition is met
data will be continuosly captured by the DMA. The streaming bit
must be set to 0 to reset triggering.
If the streaming bit is set, after the trigger condition is met,
data will be continuosly captured by the DMA. The streaming bit
must be set to 0 to reset triggering.
In non-streaming mode we want direction changes to be applied immediately.
The current code has a typo and checks the wrong signal. overwrite_data
holds the configured output value of the pin, whereas overwrite_enable
configures whether the pin is in streaming or manual mode.
For correct operation the later signal should be used to decide whether a
direction change should be applied. Otherwise the direction change will
only be applied if the output value of the pin is set to logic high.
Signed-off-by: Lars-Peter Clausen <lars@metafoo.de>
When using non-broadcast access to the GT DRP registers lane filtering is
done on both sides. The ready and data signals are filtered in the in the
axi_adxcvr module and the enable signal is filtered in the util_adxcvr
module. This works fine as long as both sides use the same transceiver IDs.
E.g. channel 0 of the axi_adxcvr module is connected to channel 0 of the
util_adxcvr module.
But this is not always the case. E.g. on the ADRV9371 platform there are
two RX axi_adxcvr modules (RX and RX_OS) connected to the same util_adxcvr.
The first axi_adxcvr uses lane 0 and 1 of the util_adxcvr, the second uses
lane 2 and 3.
Non-broadcast access for the first RX axi_adxcvr module works fine, but
always generates a timeout for the second axi_adxcvr module. This is
because lane 0/1 of the axi_adxcvr module is connected to lane 2/3 of the
util_adxcvr and when ID based filtering is done both can't match at the
same time.
To avoid this perform the filtering for all the signals in the axi_adxcvr
module. This makes sure that the same base ID is used.
This also removes the sel signal from the transceiver interfaces since it
is no longer used on the util_adxcvr side.
Signed-off-by: Lars-Peter Clausen <lars@metafoo.de>
Always explicitly specify the signal width for constants to avoid warnings
about signal width mismatch.
Signed-off-by: Lars-Peter Clausen <lars@metafoo.de>
The buffer delay should be 0 in the default configuration. The current
value of 0xb must have slipped in by accident.
Signed-off-by: Lars-Peter Clausen <lars@metafoo.de>
Use a single standalone counter that counts the number of beats since the
release of the SYNC~ signal, rather than re-using the LMFC counter plus a
dedicated multi-frame counter.
This is slightly simpler in terms of logic and also easier for software to
interpret the data.
Signed-off-by: Lars-Peter Clausen <lars@metafoo.de>
There are currently two sysref related events. One the sysref captured
event which is generated when an external sysref edge has been observed.
The other is the sysref alignment error event which is generated when a
sysref edge is observed that has a different alignment from previously
observed sysref edges.
Capture those events in the register map. This is useful for error
diagnostic. The events are sticky and write-1-to-clear.
Signed-off-by: Lars-Peter Clausen <lars@metafoo.de>
The internal LMFC offset signals are in beats, whereas the register map is
in octets. Add the proper alignment padding to the register map to
translate between the two.
Signed-off-by: Lars-Peter Clausen <lars@metafoo.de>
For SYSREF handling there are now three possible modes.
1) Disabled. In this mode the LMFC is generated internally and all external
SYSREF edges are ignored. This mode should be used for subclass 0 when no
external sysref is available.
2) Continuous SYSREF. An external SYSREF signal is required and the LMFC is
aligned to the SYSREF signal. The SYSREF signal is continuously monitored
and if a edge unaligned to the previous edges is detected the LMFC is
re-aligned to the new edge.
3) Oneshot SYSREF. Oneshot SYSREF mode is similar to continuous SYSREF mode
except only the first edge is captured and all further edges are ignored,
re-alignment will not happen.
Both in continuous and oneshot signal at least one external sysref edge is
required before an LMFC is generated. All events that require an LMFC will
be delayed until a SYSREF edge has been captured. This is done to avoid
accidental re-alignment.
Signed-off-by: Lars-Peter Clausen <lars@metafoo.de>
If the output pin is not defined as a clock, some of the Vivado IPI
propagation TCL will error out.
Signed-off-by: Matt Fornero <matt.fornero@mathworks.com>
By adding support for partial avalon transfers (data width < bus width),
valid data set size (DMA transfer length) will be dependent on the DMA bus
width only.
+ avl_write_transfer_done_s is a redundant net
+ specify the net state explicitly on if statements
+ to define the edge of avl_mem_fetch_wr_address signal,
its register and its second sync register should be used
The ad_mem_asym memory read interface has a 3 clock cycle delay, from the
moment of the address change until a valid data arrives on the bus;
because the dac_xfer_out is going to validate the outgoing samples (in conjunction
with the DAC VALID, which is free a running signal), this module will compensate
this delay, to prevent duplicated samples in the beginning of the
transaction.
+ all net names should have a *_s postfix
+ avl_burstcount is a constant 1, no need for an additional
register for it
+ all CDC should have two synchronization register, add
avl_last_beat_req_m2
The "'b0" constant will be translate as a 32 bit width vector by
ModelSim, and will throw a buswidth mismatch error. Tie the data_b
bus to zero, using its width parameter.
Currently the scripts use 'analog.com' as the vendor property for IP cores,
but 'ADI' for interfaces.
Make things consistent by using 'analog.com' for both interfaces as well
as IP cores.
Signed-off-by: Lars-Peter Clausen <lars@metafoo.de>
Make sure that the XML files are re-build when any of the scripts that are
used to generated it are modified.
Signed-off-by: Lars-Peter Clausen <lars@metafoo.de>
All the rules to generate the XML files are the same. Reduce the number of
rules by useing wildcard matching for the rule target.
Signed-off-by: Lars-Peter Clausen <lars@metafoo.de>
The ADI JESD204 link layer cores are a implementation of the JESD204 link
layer. They are responsible for handling the control signals (like SYNC and
SYSREF) and controlling the link state machine as well as performing
per-lane (de-)scrambling and character replacement.
Architecturally the cores are separated into two components.
1) Protocol processing cores (jesd204_rx, jesd204_tx). These cores take
care of the JESD204 protocol handling. They have configuration and status
ports that allows to configure their behaviour and monitor the current
state. The processing cores run entirely in the lane_rate/40 clock domain.
They have a upstream and a downstream port that accept and generate raw PHY
level data and transport level payload data (which is which depends on the
direction of the core).
2) Configuration interface cores (axi_jesd204_rx, axi_jesd204_tx). The
configuration interface cores provide a register map interface that allow
access to the to the configuration and status interfaces of the processing
cores. The configuration cores are responsible for implementing the clock
domain crossing between the lane_rate/40 and register map clock domain.
These new cores are compatible to all ADI converter products using the
JESD204 interface.
Signed-off-by: Lars-Peter Clausen <lars@metafoo.de>
The sync_data module can be used to continuously transfer multi-bit signals
like status signals safely from the source to the destination clock
domain. A transfer takes 2 source and 2 destination clock cycles. It is not
guaranteed that all transitions on the source side will be visible on the
target side if the signal is changing faster than this. Logic using this
block should be aware of it. The primary intention is for it to be used for
slowly changing status signals.
Signed-off-by: Lars-Peter Clausen <lars@metafoo.de>
The event synchronizer can be used to safely transfer 1-bit 1-clock cycle
event signals from one clock domain to another.
For each event recorded in the source domain it is guaranteed that a event
will be generated in the target domain at a later point in time. It is
possible though that multiple events in the source domain will be coalesced
into a single event in the target domain if events are generated faster
than they can be transferred.
Signed-off-by: Lars-Peter Clausen <lars@metafoo.de>
Move the CDC helper modules to a dedicated helper modules. This makes it
possible to reference them without having to use file paths that go outside
of the referencing project's directory.
Signed-off-by: Lars-Peter Clausen <lars@metafoo.de>
Currently the name of the newly created IP core is automatically inferred
from the top-level module. This works fine if there is only one top-level
IP. But for an IP core that is a collection of helper modules this fails.
Signed-off-by: Lars-Peter Clausen <lars@metafoo.de>
Currently the polarity of the reset signal is always set to negative.
Change this so that the polarity is selected on the suffix of the name. If
it ends with a 'n' or 'N' the polarity will be negative, otherwise it will
be positive.
This allows this function to be used with reset signals that have positive
polarity.
Signed-off-by: Lars-Peter Clausen <lars@metafoo.de>
This patch adds a helper function that allows to create multiple ports for
a single set of underlying signals. This is useful when the number of ports
is a configuration parameter. It sort of allows the emulation of port
arrays without having to have on set of input/output signals for each port,
instead the signals are shared by all ports.
The following snippet illustrates how this can for example be used to
generate multiple AXI-Streaming ports from a single set of signals.
<verilog>
module #(
parameter NUM_PORTS = 2
) (
input [NUM_PORTS*32-1:0] data,
input [NUM_PORTS-1:0] valid,
output [NUM_PORTS-1:0] ready,
);
...
endmodule
</verilog>
<tcl>
adi_add_multi_bus 8 "data" "slave" \
"xilinx.com:interface:axis_rtl:1.0" \
"xilinx.com:interface:axis:1.0" \
[list \
{ "data" "TDATA" 32} \
{ "valid" "TVALID" 1} \
{ "ready" "TREADY" 1} \
] \
"NUM_PORTS > {i})"
</tcl>
Signed-off-by: Lars-Peter Clausen <lars@metafoo.de>
Commit 2f023437b4 ("adi_ip- remove adi_ip_constraints") changed the
default processing order of IP core constraint files from late to normal.
This is problematic because some IP core constraint files try to access
clocks that are that are generated by different files with the normal
processing order level. These clock may or may not be available to the IP
core constraint file depending on the (random) order in which the files
were processed.
To avoid this issue change the default processing order back to late.
Signed-off-by: Lars-Peter Clausen <lars@metafoo.de>
The clock monitor reports the ratio of the clock frequencies of a known
reference clock and a monitored unknown clock. The frequency ratio is
reported in a 16.16 fixed-point format.
This means that it is possible to detect clocks that are 65535 times faster
than the reference clock. For a reference clock of 100 MHz that is 6.5 THz
and even if the reference clock is running at only 1 MHz it is still 65
GHz, a clock rate much faster than what we'd ever expect in a FPGA.
Add a configuration option to the clock monitor that allows to reduce the
number of integer bits of ratio. This allows to reduce the utilization
while still being able to cover all realistic clock frequencies.
Signed-off-by: Lars-Peter Clausen <lars@metafoo.de>
Currently when the monitored clock stops the clock monitor retains the old
frequency ratio value and there is no way to detect that the clock has
stopped and the reported value is indistinguishable form a clock still
running at the right rate.
If a full iteration as elapsed on the monitoring side and there is no
indication that the counter on the monitored side has started running set
the reported clock ratio value to 0 to indicate that the clock has stopped.
Signed-off-by: Lars-Peter Clausen <lars@metafoo.de>
Currently the clock monitor features a hold register in the monitored clock
domain. This old register is used to store a instantaneous copy of the
counter register. The value in the old register is then transferred to the
monitoring domain. Since the counter is continuously counting it is not
possible to directly transfer it since that might result in inconsistent
data.
Instead stop the counter and hold the registers stable for a duration that
is long enough for the monitoring domain to correctly capture the value.
Once the value has been transferred the counter is reset and restarted for
the next iteration.
This allows to eliminate the hold register, which slightly reduces
utilization.
The externally visible behaviour is identical before and after the patch.
Signed-off-by: Lars-Peter Clausen <lars@metafoo.de>
Make sure that the XML files are re-build when any of the scripts that are
used to generated it are modified.
Signed-off-by: Lars-Peter Clausen <lars@metafoo.de>
All the rules to generate the XML files are the same. Reduce the number of
rules by useing wildcard matching for the rule target.
Signed-off-by: Lars-Peter Clausen <lars@metafoo.de>
Sort the entries in the library Makefile alphabetical. Keeping it ordered
makes it easier to track changes compared to randomly reshuffling it
every time a new entry is added.
Signed-off-by: Lars-Peter Clausen <lars@metafoo.de>
All the hdl (verilog and vhdl) source files were updated. If a file did not
have any license, it was added into it. Files, which were generated by
a tool (like Matlab) or were took over from other source (like opencores.org),
were unchanged.
New license looks as follows:
Copyright 2014 - 2017 (c) Analog Devices, Inc. All rights reserved.
Each core or library found in this collection may have its own licensing terms.
The user should keep this in in mind while exploring these cores.
Redistribution and use in source and binary forms,
with or without modification of this file, are permitted under the terms of either
(at the option of the user):
1. The GNU General Public License version 2 as published by the
Free Software Foundation, which can be found in the top level directory, or at:
https://www.gnu.org/licenses/old-licenses/gpl-2.0.en.html
OR
2. An ADI specific BSD license as noted in the top level directory, or on-line at:
https://github.com/analogdevicesinc/hdl/blob/dev/LICENSE
There are devices which have a asynchronous data ready signal. (asynchronous
with the spi clock) The CDC stages can be enabled by setting up
the ASYNC_TRIG parameter.
In case of high precision devices with just a simple SPI interface
for control and data, the effective data rate can be significatly
lower than the SPI clock, and more importantly there isn't any relation
between the two clock domain.
The rate is defined by a SOT (start of transfer) generator, which
initiates a SPI transfer. Taking the fact that the generator runs
on system clock (100 MHz), and the device can require smaller rate (in kHz domain),
the 7 bit dac_datarate register is just too small.
Therefor increasing to 16 bit.
This core can be used in conjunction with the SPI_ENGINE, will work
as an offload module, forwarding a data stream to the SPI excecution,
received from a DMA.
Calculate the output clock frequencies based on the input clock frequencies
and the default divider settings and configure the output clock pins
accordingly. This allows connected peripherals to infer the frequency of
the clock.
Signed-off-by: Lars-Peter Clausen <lars@metafoo.de>
Instead of having to manually specify the input clock period infer the
values from the block design. This means that less configuration parameters
need to be changed if the clock input frequency changes.
Signed-off-by: Lars-Peter Clausen <lars@metafoo.de>
Add interface definition for the input and output clocks. This will allow
the tools to recognize them as clocks and enable things like clock
frequency propagation.
Signed-off-by: Lars-Peter Clausen <lars@metafoo.de>
The secondary clock inputs and outputs of the axi_clkgen are rarely used.
Add enable parameters that need to be explicitly set before they are
available. This allows to hide the secondary clock pins when they are not
used in the block design.
There are currently no projects which use the secondary clock inputs or
outputs so there is no need to set these new parameters anywhere.
Signed-off-by: Lars-Peter Clausen <lars@metafoo.de>
Vivado infers the type of floating point type parameters as integer if the
value can be expressed as an integer (i.e. decimal places are 0). To
correctly infer them as floating point parameters add types to the
parameter declaration.
Signed-off-by: Lars-Peter Clausen <lars@metafoo.de>
Can not be multiple 'if' statements inside a generate block. If there are
multiple cases use if/esle statement, but always should be one single
if/else inside a generate.
When a mapping has multiple address segments we need to consider all of
them to calculate the required address width.
Signed-off-by: Lars-Peter Clausen <lars@metafoo.de>
The address width needs to be large enough to be able to address the
largest possible address. This means the in addition to the address segment
range the specified offset also needs to be considered to calculate the
address width.
Signed-off-by: Lars-Peter Clausen <lars@metafoo.de>
up_rdata is qualified by the up_rack signal. There is no need to reset it
since by the time the signal is read the reset value has already been
overwritten anyway.
Also gate the up_rdata registers if no read operation is in progress. In
this case any changes would be ignored anyway.
Signed-off-by: Lars-Peter Clausen <lars@metafoo.de>
The axi_adc_trigger does not use the full width of the AXI interface
address. It only responds to register access in the first 32 registers.
Reduce the size of the AXI address to 7 bit accordingly. This allows the
scripts to correctly infer the internal register map size which will cause
the interconnect to filter out access to these unused register.
This slightly reduces utilization by getting rid of some pipeline
registers.
Signed-off-by: Lars-Peter Clausen <lars@metafoo.de>
The axi_adc_decimate does not use the full width of the AXI interface
address. It only responds to register access in the first 32 registers.
Reduce the size of the AXI address to 7 bit accordingly. This allows the
scripts to correctly infer the internal register map size which will cause
the interconnect to filter out access to these unused register.
This slightly reduces utilization by getting rid of some pipeline
registers.
Signed-off-by: Lars-Peter Clausen <lars@metafoo.de>
The axi_dac_interpolate does not use the full width of the AXI interface
address. It only responds to register access in the first 32 registers.
Reduce the size of the AXI address to 7 bit accordingly. This allows the
scripts to correctly infer the internal register map size which will cause
the interconnect to filter out access to these unused register.
This slightly reduces utilization by getting rid of some pipeline
registers.
Signed-off-by: Lars-Peter Clausen <lars@metafoo.de>
The axi_logic_analyzer does not use the full width of the AXI interface
address. It only responds to register access in the first 32 registers.
Reduce the size of the AXI address to 7 bit accordingly. This allows the
scripts to correctly infer the internal register map size which will cause
the interconnect to filter out access to these unused register.
This slightly reduces utilization by getting rid of some pipeline
registers.
Signed-off-by: Lars-Peter Clausen <lars@metafoo.de>
The AXI DMAC peripheral only uses 11-bit of the register map interface
address. Reducing the signal width to this value allows the scripts to
correctly infer the size of the register map.
Signed-off-by: Lars-Peter Clausen <lars@metafoo.de>
Not all peripherals need the full address space. To be able to infer the
size of the address space of a peripheral allow the size of the AXI address
signals to be configurable rather than hardcoding its width to 32 bit.
Signed-off-by: Lars-Peter Clausen <lars@metafoo.de>
Currently the register map range of a peripheral is hardcoded to 64k. Not
all peripherals need that much space though and reducing the size of the
address can reduce the amount of logic required, both in the interconnect
as well as in the peripheral.
Let adi_ip_properties() infer the size of the register map from the number
of bits of the address when creating the register map.
For backwards compatibility limit the register map size to 64k since
currently peripherals have a address width of 32 bits, event if they use
less.
Signed-off-by: Lars-Peter Clausen <lars@metafoo.de>
Currently the AXI address width of the DMA is always 32-bit. But not all
address spaces are so large that they require 32-bit to address all memory.
Extract the size of the address space that the DMA is connected too and
configure reduce the address size to the minimum required to address the
full address space.
This slightly reduces utilization.
If no mapped address space can be found the default of 32 bits is used for
the address.
Signed-off-by: Lars-Peter Clausen <lars@metafoo.de>
The delay_clk is only used internally when the IODELAYs are enabled. This
means the port has no function when the IODELAYs are disabled so hide the
port in that case.
Signed-off-by: Lars-Peter Clausen <lars@metafoo.de>
Typically when a port has a enablement dependency it also should have a
tie-off value to the port is connected to when disabled.
Make it possible to specify this tie-off value when calling
adi_set_ports_dependency().
Signed-off-by: Lars-Peter Clausen <lars@metafoo.de>
The output data mux is used to bypass the filter when it is not used. Which
setting is used for the mux depends on the 3-bit filter_mask signal.
Registering the control logic into a single bit signal reduces the amount
of routing resources required. Since changing the filter_mask settings is
asynchronous to the processing anyway the extra clock cycle delay
introduced by this change does not affect behaviour.
Signed-off-by: Lars-Peter Clausen <lars@metafoo.de>
Move the processing pipeline of the axi_adc_decimate core to its own
sub-module. This makes it easier to simulate the processing independent of
the register map.
Also since the filter is two instances of the same logic, one for each
channel, let the new sub-module model one channel and instantiate it twice.
This allows to change the implementation without having to change the same
code twice.
Signed-off-by: Lars-Peter Clausen <lars@metafoo.de>
The output data of the decimation block is 16-bit signed. Properly sign
extend the 12-bit input signal when the filter is bypassed.
Signed-off-by: Lars-Peter Clausen <lars@metafoo.de>
The minimum number of bits required for the adders in a CIC filter depends
on the decimation rate. Higher decimation factors require more bits. This
means for a multirate filter the size of the logic structures is determined
by the highest supported rate.
The current implementation of the filter always uses all bits of the
structure to compute the results, that means even when running with the
lowest decimation factor all the bits that are required for the highest
decimation factor are used. This will work fine as additional bits do not
affect the output of the filter.
This patch implements dynamic partial gating of the filter structure based
on the selected decimation factor. Bits that are not required for a certain
rates are gated and the carry bits are masked from propagating through the
adder chain. This results in significant power savings at smaller
decimation factors.
This means that the filter itself is now using more power the higher the
decimation rate. But this is offset by the reduced data output rate running
subsequent processing stages at a lower rate and reducing power consumption
there. This results in a more or less flat power profile regardless of
decimation factor.
Signed-off-by: Lars-Peter Clausen <lars@metafoo.de>
Allow to split a CIC int or comb block into multiple stages and be able to
dynamically gate some of the stages. Also prevent carry propagation in
gated stages to keep the adder output constant.
This is useful for multi-rate filter where not all bits are needed all the
time.
Signed-off-by: Lars-Peter Clausen <lars@metafoo.de>
The minimum decimation rate of the CIC block is five, this means data
arrives at the FIR filter at most every five clock cycles. The decimation
rate of the filter is two so the filter produces an output at most every
ten clock cycles. This allows for ten clock cycles to compute the result.
The current implementation of the filter uses a fully pipelined
architecture with one multiplier for each coefficient. Which then do work
for one clock cycle and sit idle for the next nine clock cycles.
Rework the filter to be sequential reducing the number of required
multipliers to one. In addition exploit the symmetric structure of the
filter to make use of the preadder reducing the required multiply
operations by two.
This significantly reduces the logic utilization of the filter as well as
moderately reduces power consumption.
Signed-off-by: Lars-Peter Clausen <lars@metafoo.de>
The minimum decimation of the CIC block is 5. This means new data arrives
at the comb stages at most every 5 clock cycles. Rather than letting the
logic sit idle during those 4 extra cycles use it to sequentially process
the comb stages of the filter. This reduces the logic utilization of the
filter by quite a bit.
Signed-off-by: Lars-Peter Clausen <lars@metafoo.de>
The output data mux is used to bypass the filter when it is not used. Which
setting is used for the mux depends on the 3-bit filter_mask signal.
Registering the control logic into a single bit signal reduces the amount
of routing resources required. Since changing the filter_mask settings is
asynchronous to the processing anyway the extra clock cycle delay
introduced by this change does not affect behaviour.
Signed-off-by: Lars-Peter Clausen <lars@metafoo.de>
Re-implement the CIC using the basic building blocks from the util_cic
library.
This new implementation is structurally equivalent to the previous version,
but will be used as a platform for implementing changes that will improve
area and power consumtion of the filter
Signed-off-by: Lars-Peter Clausen <lars@metafoo.de>
Move the processing pipeline of the axi_adc_decimate core to its own
sub-module. This makes it easier to simulate the processing independent of
the register map.
The debug registers are useful during development but are rarely used in a
production design. Add a option that allows to disable them, this reduces
the resource utilization of the DMAC.
Signed-off-by: Lars-Peter Clausen <lars@metafoo.de>
Currently the BRAM and data registers in the util_axis_data are ungated
when the FIFO is ready to receive data. This good for high-performance
since it reduces the number of control signals. But it is bad from a power
point of view since it causes additional reads and writes.
Change the core gate the BRAM and data register if either the consumer is
not ready to accept data or the producer has no data to offer.
Signed-off-by: Lars-Peter Clausen <lars@metafoo.de>
Currently the IDDRs are configured in SAME_EDGE_PIPELINED mode, but then
the negative data is delayed by an additional clock cycle. This is the same
behaviour as using the IDDR in SAME_EDGE mode.
Switching to SAME_EDGE mode removes extra pipelining registers while
maintaining the same behaviour.
Signed-off-by: Lars-Peter Clausen <lars@metafoo.de>
The current implementation doesn't quite work right when the interface
clock is slower than the trigger clock and also causes timing issues.
Disable it temporarily until a proper CDC transfer is implemented.
Signed-off-by: Lars-Peter Clausen <lars@metafoo.de>
The read and write interfaces of a AXI bus are independent other than that
they use the same clock. Yet when connecting a single read-only and a
single write-only interface to a Xilinx AXI interconnect it instantiates
arbitration logic between the two interfaces. This is dead logic and
unnecessarily utilizes the FPGAs resources.
Introduce a new helper module that takes a read-only and a write-only AXI
interface and combines them into a single read-write interface. The only
restriction here is that all three interfaces need to use the same clock.
This module is useful for systems which feature a read DMA and a write DMA.
Signed-off-by: Lars-Peter Clausen <lars@metafoo.de>
The register read logic is not that complicated that it needs two extra
pipeline stages. It can easily be condensed into a single combinatorial and
still meet timing with large margins.
Signed-off-by: Lars-Peter Clausen <lars@metafoo.de>
Disable registers in the register map which are not needed for this core.
This reduces the utilization of the core.
Signed-off-by: Lars-Peter Clausen <lars@metafoo.de>
Not all peripherals use the GPIO register settings, but the registers still
take up a fair amount of space in the register map. Add options to allow to
disable them when not needed. This helps to reduce the utilization for
peripherals where these features are not needed.
Signed-off-by: Lars-Peter Clausen <lars@metafoo.de>
Not all peripherals use the GPIO and START_CODE register settings, but the
registers still take up a fair amount of space in the register map. Add
options to allow to disable them when not needed. This helps to reduce the
utilization for peripherals where these features are not needed.
Signed-off-by: Lars-Peter Clausen <lars@metafoo.de>
Depending on whether the core is configured for AXI4 or AXI3 mode the width
of the awlen/arlen signal is either 8 or 4 bit. At the moment this is only
considered in top-level module and all other modules use 8 bit internally.
This causes warnings about truncated signals in AXI3 mode, to resolve this
forward the width of the signal through the core.
Signed-off-by: Lars-Peter Clausen <lars@metafoo.de>
Declaring local parameters in the module parameter list is not valid
verilog. For some reasons Vivado accepts it nevertheless so the code has
worked so far. But this is not true for other tools, so move the local
parameter definitions inside the module body.
Signed-off-by: Lars-Peter Clausen <lars@metafoo.de>
For experimentation, to solve a constraint scoping issue, split up the
ad_axi_ip_constraint file into separate constraints file, in function
of there parent module.
It seems that in the latest version a constant of "0" is no longer a valid
enablement dependency and "false" has be used instead.
Not setting the enablement dependency correctly results in the AXI port to
be assumed to be read-write rather than just read or write. This will
generate unnecessary logic for example in interconnects to which the DMA
controller is connected.
Signed-off-by: Lars-Peter Clausen <lars@metafoo.de>
Xilinx recommends that all synchronizer flip-flops have
their ASYNC_REG property set to true in order to preserve the
synchronizer cells through any logic optimization during synthesis
and implementation.
Xilinx recommends that all synchronizer flip-flops have
their ASYNC_REG property set to true in order to preserve the
synchronizer cells through any logic optimization during synthesis
and implementation.
Add a human readable name and descriptor for the AXI DMAC core.This string
will appear in various places e.g. like the IP catalog. This is a purely
cosmetic change.
Signed-off-by: Lars-Peter Clausen <lars@metafoo.de>
Add a register to the AXI DMAC register map which functions has a
identification register. The register contains the unique value of "DMAC"
(0x444d4143) and allows software to identify whether the peripheral mapped
at a certain address is an axi_dmac peripheral.
This is useful for detecting cases where the specified address contains an
error or is incorrect.
Signed-off-by: Lars-Peter Clausen <lars@metafoo.de>
- Change the clock and reset port name of the AXI slave interface
to s_axi_aclk and s_axi_aresetn. This way we can use the adi_ip_properties
process to infer the interface.
- Define an address space reference to the m_axi interface.
- Add a process, which automaticaly infer AXI memory mapped
interfaces (adi_ip_infer_mm_interfaces)
- Add missign line breaks to the 'set_propery supported_families'
command
- Fix the deletion of pre-infered memory maps
This patch is a complementary fix of 8b8c37 patch. And fix
all the 'infer interface' issues.
The adi_ip_infer_interfaces process was renamed to
adi_ip_infer_streaming_interfaces. Now the process just do
what its name suggest.
Affected cores were axi_dmac, axi_spdif_rx, axi_spdif_tx, axi_i2s_adi
and axi_usb_fx3. All these cores scripts were updated.
The SYSREF generator is using a simple free running counter,
which runs on the JESD204 core clock. The period can be
configured using a parameter, it must respect the constraints
defined by the JESD204 standard.
The generator can be enabled through a GPIO line.
The AXI Memory Map interface is infered in the adi_ip_properties process.
Infer it again in the adi_ip_infer_interfaces brakes the flow,
the tool will not find the cell's address segment, so there will not be
any address space assigned to the AXI interface.
Affected cores were axi_i2s_adi and axi_spdif_tx.
Start the counter_to_interrupt_cnt counter when the counter_to_interrupt
value is written to the register map. This gives applications better
control over when the counter starts counting.
Also start the counter_from_interrupt on the rising edge of the interrupt
signal to avoid bogus values.
Signed-off-by: Lars-Peter Clausen <lars@metafoo.de>
The IRQ signal goes to a asynchronous domain. In order to avoid glitches to
be observed in that domain make sure that the output signal is fully
registered.
This means that the IRQ signal is no longer mask when the control enable
bit is not set. Instead modify the code to clear the interrupt when the
control enable bit is not set. This turns it into a true reset for the
internal state.
Signed-off-by: Lars-Peter Clausen <lars@metafoo.de>
The I2S interface has a clock associated to it twice, this will generate a
critical warning when using the core, so remove one of them.
Signed-off-by: Lars-Peter Clausen <lars@metafoo.de>
The axi_jesd_gt was repleaced by axi_adxcvr IP, which is located
at library/xilinx and library/altera.
The axi_jesd_xcvr was an early version of axi_adxcvr.
The register map is moved to the IP's directory.
Add a tcl process, which can be used to generate custom module
names during the generation phase. This will be used to create
different ad_serdes_clk module, in case when independent IOPLLs are
needed for TX and RX.
Replace "PRIMITIVE_SUBGROUP == flop" with "IS_SEQUENTIAL" as the former is
series7 specific while the later works on all platforms. This fixes the
axi_dmac timing constraints for ultrascale based platforms.
Signed-off-by: Lars-Peter Clausen <lars@metafoo.de>
In modules ad_serdes_in/ad_serdes_out the handover of the parameter
SERDES_FACTOR did not exist, causing unwanted behavioral in case of
factors less than 8.
SERDES_FACTOR must be hand over to DATA_WIDTH parameter of the SERDES
primitive.
- trig signal will reset state machine
- slrd_n delay will be absorbed by the axi_usb_fx3_if module, when Xilinx DMA is not ready to receive data during a packet
- fx32dma_eop signals when the FX3 DMA buffer should be empty. slrd_n set high and sloe_n set low for another two clock cycles
- eot_fx32dma signals the interface that the packet has been fully transfered. No need for watermark signals
- added length_fx32dma and length_dma2fx3 as requested