Move the subtraction outside of the always block. In this way we're not adding
an additional delay element on to the output of the differentiator,
which brakes the transfer function of the filter.
Common basic steps:
- Include/create infrastructure:
* Intel:
- require quartus::device package
- set_module_property VALIDATION_CALLBACK info_param_validate
* Xilinx
- add bd.tcl, containing init{} procedure. The init procedure will be
called when the IP will be instantiated into the block design.
- add to the xilinx_blockdiagram file group the bd.tcl and common_bd.tcl
- create GUI files
- add parameters in *_ip.tcl and *_hw.tcl (adi_add_auto_fpga_spec_params)
- add/propagate the info parameters through the IP verilog files
axi_clkgen
util_adxcvr
ad_ip_jesd204_tpl_adc
ad_ip_jesd204_tpl_dac
axi_ad5766
axi_ad6676
axi_ad9122
axi_ad9144
axi_ad9152
axi_ad9162
axi_ad9250
axi_ad9265
axi_ad9680
axi_ad9361
axi_ad9371
axi_adrv9009
axi_ad9739a
axi_ad9434
axi_ad9467
axi_ad9684
axi_ad9963
axi_ad9625
axi_ad9671
axi_hdmi_tx
axi_fmcadc5_sync
To prevent the case, when after an invalid configuration, the generated
output PWM signal is constant HIGH, change the counter to a
down-counter. In this way the pulse will be placed at the end of the
PWM period, and if the configured width value is higher than the
configured period the output signal will be constant LOW.
Write code to pipeline data path for better DSP utilization on the
color space conversion.
In the old method the addition operations were performed outside the
DSPs
For consistent simulation behavior it is recommended to annotate all source
files with a timescale. Add it to those where it is currently missing.
Signed-off-by: Lars-Peter Clausen <lars@metafoo.de>
The perfect shuffle is a common operation in data processing. Add a shared
module that implements this operation.
Having this in a shared module rather than open-coding every instance makes
sure that there are clear and well defined semantics associated with the
operation that are the same each time. This should ease review, maintenance and
understanding of the code.
The perfect shuffle splits the input vector into NUM_GROUPS groups and then
each group in WORDS_PER_GROUP. The output vector consists of
WORDS_PER_GROUP groups and each group has NUM_GROUPS words. The data is
remapped, so that the i-th word of the j-th word in the output vector is
the j-th word of the i-th group of the input vector.
The inverse operation of the perfect shuffle is the perfect shuffle with
both parameters swapped.
I.e. [perfect_suffle B A [perfect_shuffle A B data]] == data
Examples:
NUM_GROUPS = 2, WORDS_PER_GROUP = 4
[A B C D a b c d] => [A a B b C c D d]
NUM_GROUPS = 4, WORDS_PER_GROUP = 2
[A a B b C c D d] => [A B C D a b c d]
NUM_GROUPS = 3, WORDS_PER_GROUP = 2
[A B a b 1 2] => [A a 1 B b 2]
Signed-off-by: Lars-Peter Clausen <lars@metafoo.de>
If DDS_DW is equal to DDS_D_DW there is no signal truncation and
consequentially no rounding should be performed. But the check whether
rounding should be performed currently is for if DDS_DW is less or equal to
DDS_D_DW.
When both are equal C_T_WIDTH is 0. This results in the expression
'{(C_T_WIDTH){dds_data_int[DDS_D_DW-1]}};' being a 0 width signal. This is
not legal Verilog, but both the Intel and Xilinx tools seem to accept it
nevertheless.
But the iverilog simulation tools generates the following error:
ad_dds_2.v:102: error: Concatenation repeat may not be zero in this context.
Xilinx Vivado also generates the following warning:
WARNING: [Synth 8-693] zero replication count - replication ignored [ad_dds_2.v:102]
Change the condition so that truncation is only performed when DDS_DW is
less than DDS_D_DW. This fixes both the error and the warning.
Signed-off-by: Lars-Peter Clausen <lars@metafoo.de>
This patch will fix the following critical warning, generated by Quartus:
"Critical Warning (18061): Ignored Power-Up Level option on the following
registers
Critical Warning (18010): Register ad_rst:i_core_rst_reg|rst_sync will power
up to High File: ad_rst.v Line: 50"
For a proper reset synchronization, the asynchronous reset signal should
be connected to the reset pins of the two synchronizer flop, and the
data input of the first flop should be connected to VCC.
In the first stage we're synchronizing just the reset de-assertion, avoiding
the scenario when different parts of the design are reseting at different time,
causing unwanted behaviours.
In the second stage we're synchronizing the reset assertion.
The module expects an ACTIVE_HIGH input reset signal, and provides an ACTIVE_LOW
(rstn) and an ACTIVE_HIGH (rst) synchronized reset output signal.
- remove reset logic
- add wait for dac valid logic
- rewrite sine concatenation on wires for different path width to
suppress warnings
- use computed atan LUT tables
The CORDIC has a selectable width range for phase and data of 8-24.
Regarding the width of phase and data, the wider they are the smaller
the precision loss when shifting but with the cost of more FPGA
utilization. The user must decide between precision and utilization.
The DDS_WD parameter is independent of CORDIC(CORDIC_DW) or
Polynomial(16bit), letting the user chose the output width.
Here we encounter two scenarios:
* DDS_DW < DDS data width - in this case, a fair rounding will be
implemented corresponding to the truncated bits
* DDS_DW > DDS data width - DDS out data left shift to get the
corresponding concatenation bits.
Update for the parametrized ad_mul module. This will scale
a selectable sine width in a multiplication module.
Rename the data and phase width parameters for legibility.
When the tool calculates the X value for different phase widths, we
get rounding errors for every width in the interval [8;24].
Depending on the width thess errors cause overflows or smaller amplitudes
of the sine waves.
The error is not linear nor proportional with the phase. To fix the issue
a simple aproximation was chosen.
Perform the shifting operation before addition/subtraction in a
rotation stage. In the previous method, the result of the arithmetic
operation was shifted and the outcome was presented to the next stage.
In this way, data connections will be reduced between pipeline stages
Add parameters:
- to select the sine generator (polynomial/CORDIC)
- to select the CORDIC data width(default 16)
Suppress the warnings generated when the DDS is disabled.
https://en.wikipedia.org/wiki/CORDIC
Configurable in/out data width (14,16,18,20);
The HDL implementation requires pipelines, resulting in a
data_width + 2 clock cycles delay between the phase input data and the
sine data. For this reason, a ddata (delay data) was propagated through
the pipeline stages to help in future use scenarios
The ADC DMA will never underflow and unsurprisingly the adc_dunf signal is
never used anywhere. It is very unlikely it will ever be used, so remove
it.
Signed-off-by: Lars-Peter Clausen <lars@metafoo.de>
The DAC DMA will never overflow and unsurprisingly the dac_dovf signal is
never used anywhere. It is very unlikely it will ever be used, so remove
it.
Signed-off-by: Lars-Peter Clausen <lars@metafoo.de>
Fix the following warnings that are generated by Quartus:
Warning (10230): Verilog HDL assignment warning at ad_sysref_gen.v(68): truncated value with size 32 to match size of target (8)
No functional changes.
Signed-off-by: Lars-Peter Clausen <lars@metafoo.de>
Fix the following warnings that are generated by Quartus:
Warning (10036): Verilog HDL or VHDL warning at ad_datafmt.v(69): object "sign_s" assigned a value but never read
Move the sign_s and signext_s signals into the generate block in which
they are used.
No functional changes.
Signed-off-by: Lars-Peter Clausen <lars@metafoo.de>
The DC filter implementation in library/common/dc_filter.v is Xilinx
specific as it uses the Xilinx DSP48 hard-macro. There is a matching Altera
specific implementation in library/altera/common/dc_filter.v.
Move the Xilinx specific implementation from the generic common folder to
the Xilinx specific common folder in library/xilinx/common/ since that is
where all other Xilinx specific common modules reside.
Signed-off-by: Lars-Peter Clausen <lars@metafoo.de>
In cases when a shallow FIFO is requested the synthesizer infers distributed RAM
instead of block RAMs. This can be an issue when the clocks of the FIFO are
asynchronous since a timing path is created though the LUTs which implement the
memory, resulting in timing failures. Ignoring timing through the path is not a
solution since would lead to metastability.
This does not happens with block RAMs.
The solution is to use the ad_mem (block RAM) in case of async clocks and letting
the synthesizer do it's job in case of sync clocks for optimal resource utilization.
This module upscale an n*sample_width data bus into a 16 or 32*n data
bus. The samples are right aligned and supports offset binary or two's
complement data format.
The up_rstn is driven by s_axi_resetn, which is generated by a
Processor System Reset module. (connected to port peripheral_aresetn)
Therefor using this reset signal as an asynchronous reset is redundant,
and a bad design practice at the same time. Asynchronous reset should be
used if it's inevitable.
The period_count should be updated once per clock cycle. This is not
enforced with the current implementation, which probably leads to
period_count being decremented on both m_axis_aclk edges.
A problem observed due to this is that the m_axis_tlast output is not
asserted or is asserted for a too short time for the consumer to
detect it.
Fix by letting the decrement (and thus the m_axis_tlast toggling)
happen only on the rising edge of the m_axis_aclk clock.
Signed-off-by: Luca Ceresoli <luca@lucaceresoli.net>
At the moment the drain signal is always asserted when the controller is
enabled. This breaks backpressure and data is lost. The drain signal should
only be asserted when the controller gets disabled until the last beat of
the current DMA transfer.
Signed-off-by: Lars-Peter Clausen <lars@metafoo.de>
The ADI transport layer peripherals expect the first octet to be in the
LSBs and the last octet to be in the MSBs. The Altera JESD204 core orders
the octets the other way around though, first octet in the MSBs and last
octet in the LSBS.
Currently this is handled by having each transport layer peripheral swap
the octets around when it is connected to the Altera JESD204 core.
Change this so that rather than having to do the data swizzling in every in
every transport layer peripheral perform it at the input/output of the link
layer peripheral inside the generated block.
Signed-off-by: Lars-Peter Clausen <lars@metafoo.de>
+ Add a HDL parameter for the PPS receiver module :
PPS_RECEIVER_ENABLE. By default the module is disabled.
+ Add the CMOS_OR_LVDS_N and PPS_RECEIVER_ENABLE into the CONFIG
register
+ Define a pps_status read only register, which will be asserted, if the free
running counter reach a certain fixed threshold. (2^28) The register can
be deasserted by an incomming PPS only.
The external s_axi_{awaddr,araddr} signals that are connect to the core
have their width set according to the specified size of the register map.
If the s_axi_{awaddr,araddr} signal of the core is wider (as it currently
is for many cores) the MSBs of those signals are left unconnected, which
generates a warning.
To avoid this make sure that the signal width matches the declared register
map size.
Signed-off-by: Lars-Peter Clausen <lars@metafoo.de>
The ad_pps_receiver is instantiated at the top of core.
The rcounter is placed into adc/dac_common registers space, at the
address 0x30 (word aligned).
The interrupt mask is placed into adc/dac_common, at the address 0x04
(word aligned). Because the core has an instance of both modules, the
interrupt masks are OR-ed together.
Add a module to receive 1PPS signal from a GPS module. The module has a
free running counter, which runs on the device's interface clock. The
counter value is latched into a register each time when a 1PPS arrives.
An interrupt signal is also generated in every 1PPS.
The MSB of the d_count signal is used as a overflow marker to stop the
counter from incrementing in the monitored clock domain. It is not exported
through the register map and truncated when assigned to the up_d_count
signal.
Make the truncation explicit to make it clear that this is not a mistake
and to avoid warnings about implicit truncation.
Signed-off-by: Lars-Peter Clausen <lars@metafoo.de>
All verilog file are using the Verilog-2001 standard to define
and/or declare ports. Definin a port width with a local parameter
is a bad practive, when this standard is used. Some simulators
will crash. Try to avoid it.
Move the CDC helper modules to a dedicated helper modules. This makes it
possible to reference them without having to use file paths that go outside
of the referencing project's directory.
Signed-off-by: Lars-Peter Clausen <lars@metafoo.de>
The clock monitor reports the ratio of the clock frequencies of a known
reference clock and a monitored unknown clock. The frequency ratio is
reported in a 16.16 fixed-point format.
This means that it is possible to detect clocks that are 65535 times faster
than the reference clock. For a reference clock of 100 MHz that is 6.5 THz
and even if the reference clock is running at only 1 MHz it is still 65
GHz, a clock rate much faster than what we'd ever expect in a FPGA.
Add a configuration option to the clock monitor that allows to reduce the
number of integer bits of ratio. This allows to reduce the utilization
while still being able to cover all realistic clock frequencies.
Signed-off-by: Lars-Peter Clausen <lars@metafoo.de>
Currently when the monitored clock stops the clock monitor retains the old
frequency ratio value and there is no way to detect that the clock has
stopped and the reported value is indistinguishable form a clock still
running at the right rate.
If a full iteration as elapsed on the monitoring side and there is no
indication that the counter on the monitored side has started running set
the reported clock ratio value to 0 to indicate that the clock has stopped.
Signed-off-by: Lars-Peter Clausen <lars@metafoo.de>
Currently the clock monitor features a hold register in the monitored clock
domain. This old register is used to store a instantaneous copy of the
counter register. The value in the old register is then transferred to the
monitoring domain. Since the counter is continuously counting it is not
possible to directly transfer it since that might result in inconsistent
data.
Instead stop the counter and hold the registers stable for a duration that
is long enough for the monitoring domain to correctly capture the value.
Once the value has been transferred the counter is reset and restarted for
the next iteration.
This allows to eliminate the hold register, which slightly reduces
utilization.
The externally visible behaviour is identical before and after the patch.
Signed-off-by: Lars-Peter Clausen <lars@metafoo.de>