CNC Services
This section describes the CNC services that are common to all counter blocks and clients.
The CNC unit contains a number of counter blocks. Their number varies among devices. Each block can be bound to one client. Several clients must be explicitly enabled for counting, and for some of them the port must then also be enabled for counting for that specific client.
For typical use of the CNC unit, call the following APIs:
To bind a block to a client, call cpssDxChCncBlockClientEnableSet.
To set the range of block indexes, call cpssDxChCncBlockClientRangesSet.
To set the format of the indexes, call cpssDxChCncCounterFormatSet.
To enable the unit to use the CNC block (not relevant to every client), call cpssDxChCncCountingEnableSet.
To enable a port for using the CNC per client (not relevant to every client), call cpssDxChCncPortClientEnableSet.
In devices with multiple port-groups, such as BobCat3 and Falcon, counter blocks are maintained per port-group. Thus, to utilize all available counter blocks, configure them per port-group. Common APIs used for port-groups:
To bind a block to a client, call cpssDxChCncPortGroupBlockClientEnableSet.
To set block index range, call cpssDxChCncPortGroupBlockClientRangesSet.
To set index format, call cpssDxChCncPortGroupCounterFormatSet.
To clear or set a counter value, call cpssDxChCncPortGroupCounterSet.
Binding a Counter Block to a Client
The clients capable of using a counter block are defined by CPSS_DXCH_CNC_CLIENT_ENT.
See the device’s Functional Specifications and the Table Sizes and Resources addendum for the number of blocks and the number of counters per block relevant to a specific device.
To bind/unbind a counter block to a specific client, call cpssDxChCncBlockClientEnableSet. For multiple port-group devices, the user can call cpssDxChCncPortGroupBlockClientEnableSet per port group.
For all non-Gen 6.5 devices, the same counter block must not be bound to more than one client. A counter block must first be unbound from its current client, and then bound to a new client. The application must enforce this restriction.
For Gen 6.5 devices, for better utilization of counters, a counter block can be bound to up to three clients if several counters per client are needed. Call cpssDxChCncBlockSharedClientConfigSet to set the (up to three) clients to bind to a counter block, as well as respective non-overlapping index ranges for each client. All clients sharing a block must use the same counter mode.
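The non-overlap requirement for shared clients can be checked by the application before configuring a block. The following is a minimal sketch in standalone C, not CPSS code: the `IndexRange` type and both helpers are hypothetical, and representing each client's range as an inclusive first/last pair is an assumption made for illustration only.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical inclusive index range assigned to one client of a shared block. */
typedef struct {
    uint32_t first;
    uint32_t last;
} IndexRange;

/* Two inclusive ranges overlap if each one starts before the other ends. */
static bool ranges_overlap(IndexRange a, IndexRange b)
{
    return a.first <= b.last && b.first <= a.last;
}

/* Return true if 1 to 3 client ranges are well formed and pairwise disjoint. */
static bool shared_ranges_valid(const IndexRange *ranges, size_t count)
{
    size_t i, j;

    if (count == 0 || count > 3) {
        return false;               /* a block is shared by at most three clients */
    }
    for (i = 0; i < count; i++) {
        if (ranges[i].first > ranges[i].last) {
            return false;           /* malformed range */
        }
        for (j = i + 1; j < count; j++) {
            if (ranges_overlap(ranges[i], ranges[j])) {
                return false;
            }
        }
    }
    return true;
}
```

A check of this kind would run before the call that configures the shared clients, so that an invalid layout is rejected in software.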
A client can be bound to more than one counter block.
Counter Index Range Selection
Each block of counters has a certain number of entries, which varies from one device to another. To select the index ranges served by a block, call cpssDxChCncBlockClientRangesSet, or for multi port-group devices cpssDxChCncPortGroupBlockClientRangesSet. The available ranges also vary among devices, and in general can extend beyond the maximum number of counters. These APIs take the client as a parameter. In eArch devices, this parameter is ignored.
Example
A client, CPSS_DXCH_CNC_CLIENT_L2L3_INGRESS_VLAN_E, needs 2 blocks of counters. For this, the application allocates block 0 and block 1. Each block has:
xCat3/AlleyCat5 – 2048 entries
Lion2 – 512 entries
Gen5 devices and above – 1024 entries
The application calls cpssDxChCncBlockClientRangesSet twice, once for each block, with the indexRangesBmp parameter (of type GT_U64) set accordingly.
For the first block, use:
indexRangesBmp.l[0] = 1;
indexRangesBmp.l[1] = 0;
cpssDxChCncBlockClientRangesSet(devNum, blockNum=0, CPSS_DXCH_CNC_CLIENT_L2L3_INGRESS_VLAN_E, indexRangesBmp);
For the second block, use:
indexRangesBmp.l[0] = 2;
indexRangesBmp.l[1] = 0;
cpssDxChCncBlockClientRangesSet(devNum, blockNum=1, CPSS_DXCH_CNC_CLIENT_L2L3_INGRESS_VLAN_E, indexRangesBmp);
In this example, for a device with 1024 entries per block, block 0 is mapped to counter range 0-1023 and block 1 is mapped to counter range 1024-2047.
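The bitmap arithmetic in the example above can be sketched in standalone C. The `RangeBmp` type is a hypothetical stand-in for GT_U64 (two 32-bit words, `l[0]` low and `l[1]` high), and `range_bmp_for` is an illustrative helper, not a CPSS function. The sketch assumes that bit n of the bitmap selects the index range starting at n times the block size, which is consistent with the block 1 mapping described above.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for GT_U64: l[0] holds bits 0-31, l[1] holds bits 32-63. */
typedef struct {
    uint32_t l[2];
} RangeBmp;

/* Build a bitmap that selects the single range containing firstIndex.
 * Assumption: bit n covers indexes n*blockSize .. (n+1)*blockSize - 1. */
static RangeBmp range_bmp_for(uint32_t firstIndex, uint32_t blockSize)
{
    RangeBmp bmp = { { 0u, 0u } };
    uint32_t range;

    assert(blockSize != 0u);
    range = firstIndex / blockSize;
    assert(range < 64u);                 /* the bitmap has 64 bits */
    bmp.l[range / 32u] = (uint32_t)1u << (range % 32u);
    return bmp;
}
```

With a block size of 1024, a first index of 0 yields `l[0] = 1` and a first index of 1024 yields `l[0] = 2`, matching the two calls in the example.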
Counter Entry Format
A CNC entry is 64 bits wide. These 64 bits are partitioned into 2 counters: one counts packets and the other counts bytes. The user can choose among the partitioning formats defined by CPSS_DXCH_CNC_COUNTER_FORMAT_ENT. CPSS refers to these formats as modes; they differ in the sizes of the packet and byte counters. Partition mode definitions are:
Partition Mode 0 allocates 29 bits for the packet counter, and 35 bits for the byte counter
Partition Mode 1 allocates 27 bits for the packet counter, and 37 bits for the byte counter
Partition Mode 2 allocates 37 bits for the packet counter, and 27 bits for the byte counter
Partition Mode 3 allocates 64 bits for the packet counter, and 0 bits for the byte counter
Partition Mode 4 allocates 0 bits for the packet counter, and 64 bits for the byte counter
Partition Mode 5 – A unique counter type, available only in Falcon devices and only for the queue/port statistics client.
Clients using counter format 5 must provide the CNC block with a statistic value, such as queue or port buffer utilization. The CNC entry stores the maximum value observed so far. The application can read this value and reset it to 0.
To set the format per block of counters, call cpssDxChCncCounterFormatSet, or for multi port-group devices cpssDxChCncPortGroupCounterFormatSet.
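The choice of partition mode determines how far each counter can count before it reaches its maximum. The following standalone C sketch derives those maxima from the bit widths listed above for modes 0 to 4. The type and function names are hypothetical and are not part of CPSS, and the sketch deliberately says nothing about where each field sits inside the 64-bit entry.

```c
#include <assert.h>
#include <stdint.h>

/* Bit widths of the packet and byte counters for partition modes 0-4,
 * as listed in the text. Mode 5 is not a packet/byte split and is omitted. */
typedef struct {
    unsigned pktBits;
    unsigned byteBits;
} ModeWidths;

static const ModeWidths kModeWidths[5] = {
    { 29u, 35u },  /* mode 0 */
    { 27u, 37u },  /* mode 1 */
    { 37u, 27u },  /* mode 2 */
    { 64u,  0u },  /* mode 3 */
    {  0u, 64u },  /* mode 4 */
};

/* Largest value representable in a counter of the given width. */
static uint64_t max_for_bits(unsigned bits)
{
    assert(bits <= 64u);
    if (bits == 0u) {
        return 0u;
    }
    if (bits == 64u) {
        return UINT64_MAX;               /* avoid an undefined 64-bit shift */
    }
    return ((uint64_t)1 << bits) - 1u;
}

static uint64_t max_packets(unsigned mode)
{
    assert(mode < 5u);
    return max_for_bits(kModeWidths[mode].pktBits);
}

static uint64_t max_bytes(unsigned mode)
{
    assert(mode < 5u);
    return max_for_bits(kModeWidths[mode].byteBits);
}
```

For example, in mode 0 the packet counter tops out at 2^29 - 1 and the byte counter at 2^35 - 1, so a client that sees many small packets may prefer mode 2, which gives the packet counter 37 bits.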
Byte Counter Mode
The byte counter can be configured to count packet bytes starting from either the L2 header or the L3 header. These two modes are defined by CPSS_DXCH_CNC_BYTE_COUNT_MODE_ENT and can be applied to each client independently:
L2 mode – The byte counter counts the entire packet bytes for all packet types.
L3 mode – The byte counter counts the packet’s L3 fields (the entire packet minus the L3 offset) and only the passenger part for tunnel-terminated packets or tunnel-start packets.
To set the byte counting mode, call cpssDxChCncClientByteCountModeSet.
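The difference between the two modes for a regular (non-tunneled) packet can be illustrated with a small standalone C sketch. The `ByteMode` enum and `counted_bytes` helper are hypothetical names that only mirror the two modes described above. The sketch does not model tunnel-start or tunnel-terminated packets, and it is not a statement about exactly which bytes the hardware includes.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical mirror of the two byte count modes described in the text. */
typedef enum {
    BYTE_MODE_L2,   /* count the entire packet */
    BYTE_MODE_L3    /* count the packet minus the L3 offset */
} ByteMode;

/* Bytes added to the byte counter for one non-tunneled packet. */
static uint32_t counted_bytes(ByteMode mode, uint32_t packetLen, uint32_t l3Offset)
{
    assert(l3Offset <= packetLen);
    return (mode == BYTE_MODE_L2) ? packetLen : packetLen - l3Offset;
}
```

For a 128-byte packet whose L3 header starts at offset 14, this model counts 128 bytes in L2 mode and 114 bytes in L3 mode.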
Read/Write Counter Entry
Each CNC entry contains 2 counters, a packet counter and a byte counter. It is possible to read the value of each counter in every counter entry, and to set its value as described in Write Counter Value.
For all devices except xCat3/AlleyCat5 – The CNC block has an arbiter between read requests from the host CPU and write requests from the device engines. Requests from the host CPU should have higher priority to avoid delaying or starving the host CPU. For that purpose, enable strict priority of CPU access to counter blocks by calling cpssDxChCncCpuAccessStrictPriorityEnableSet with enable set to GT_TRUE.
Read Counter Value
To read a counter, call cpssDxChCncCounterGet, or for multiple-port group devices cpssDxChCncPortGroupCounterGet.
Write Counter Value
For devices following AlleyCat3X, the write operation is coupled with read, known as clear-on-read.
If clear-on-read is not enabled, configure the behavior upon wraparound as described in Counter Wrap Around.
To configure clear-on-read globally for CNC counters, with either a 0 or a non-0 clear value:
Enable the clear-by-read operation by calling cpssDxChCncCounterClearByReadEnableSet.
Set the global value for clear-on-read by calling cpssDxChCncCounterClearByReadValueSet.
The value is set per counter format, as listed by CPSS_DXCH_CNC_COUNTER_FORMAT_ENT, and the entry is defined by CPSS_DXCH_CNC_COUNTER_STC.
For devices up to AlleyCat3X – To set a value for a specific counter regardless of read operation, call cpssDxChCncCounterSet. For Gen4 multiple-port-group devices, call cpssDxChCncPortGroupCounterSet to set a counter for a specific port-group.
The last API takes the block number, the index of the counter, and the counter format. The index passed by the application is a function of the client. For every client, the index must be calculated differently. Tables in CNC Indexing Format show the counter indexing per client.
Counter Wrap Around
When a packet or byte counter reaches its maximum value, it either wraps around to zero and continues to count, or remains fixed with the maximum value until the user reads and clears it.
The wrap around is enabled by calling cpssDxChCncCounterWraparoundEnableSet with enable = GT_TRUE. This API is global and applies to all centralized counters.
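The two behaviors at the maximum value can be modeled in standalone C as follows. This is a software sketch of the described semantics, not device or CPSS code. `counter_add` is a hypothetical helper, and the counter width is limited to 1-63 bits only to keep the arithmetic free of 64-bit overflow.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Add delta to a counter that is `bits` wide.
 * wrapEnabled = true : the counter wraps around and continues to count.
 * wrapEnabled = false: the counter stays at its maximum value. */
static uint64_t counter_add(uint64_t value, uint64_t delta,
                            unsigned bits, bool wrapEnabled)
{
    uint64_t max;
    uint64_t sum;

    assert(bits >= 1u && bits <= 63u);
    max = ((uint64_t)1 << bits) - 1u;
    assert(value <= max && delta <= max);

    sum = value + delta;                 /* cannot overflow: both operands < 2^63 */
    if (sum <= max) {
        return sum;
    }
    return wrapEnabled ? (sum & max) : max;
}
```

With a 4-bit counter at 15, adding 1 gives 0 when wraparound is enabled and 15 when it is disabled.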
The device maintains a wraparound status table for each counter block. This table keeps track of up to 8 counters that reached their maximum value.
To get the array of indexes of counters that wrapped around, call cpssDxChCncCounterWraparoundIndexesGet. For multi-port group devices, call cpssDxChCncPortGroupCounterWraparoundIndexesGet to access a specific port group.
The API receives a pointer to an array with the maximal size of indexes (8) as an input, and returns the actual number of indexes that wrapped around as an output. After the wraparound status of an index is read, it is cleared from the wraparound status table.
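The read-and-clear behavior of the wraparound status table can be modeled in standalone C as shown below. This is a software model for illustration only. The `WrapTable` type and its helpers are hypothetical, and dropping new indexes when the table is already full is an assumption, since the behavior in that case is not described here.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define WRAP_TABLE_SIZE 8u   /* up to 8 wrapped counters are tracked per block */

typedef struct {
    uint32_t indexes[WRAP_TABLE_SIZE];
    uint32_t count;
} WrapTable;

/* Record a wrapped counter index. Returns false if the table is full
 * (assumption: the new index is simply not recorded). */
static bool wrap_table_record(WrapTable *t, uint32_t index)
{
    assert(t->count <= WRAP_TABLE_SIZE);
    if (t->count == WRAP_TABLE_SIZE) {
        return false;
    }
    t->indexes[t->count++] = index;
    return true;
}

/* Copy up to outSize recorded indexes into out, remove them from the table,
 * and return how many were copied. */
static uint32_t wrap_table_read(WrapTable *t, uint32_t *out, uint32_t outSize)
{
    uint32_t n = (t->count < outSize) ? t->count : outSize;
    uint32_t i;

    for (i = 0; i < n; i++) {
        out[i] = t->indexes[i];
    }
    for (i = n; i < t->count; i++) {     /* keep any entries that were not read */
        t->indexes[i - n] = t->indexes[i];
    }
    t->count -= n;
    return n;
}
```

In this model a second read immediately after the first returns 0 entries, which reflects the statement that an index is cleared from the table once its status has been read.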
Counter Block Upload
The device can be periodically triggered to upload an entire counter block to a pre-allocated host memory. The pre-allocated memory is the same memory optionally used by the Bridge FDB upload memory (FU queue). If the application uses both the FU queue and counter block upload, the pre-allocated memory must be sized according to the larger of the two requirements. Moreover, this memory can be used by only one of the engines (bridge or CNC) at a time.
To enable the CNC upload, the parameter fuqUseSeparate (in ppPhase2ParamsPtr) must be set to GT_TRUE. This is set by cpssDxChHwPpPhase2Init.
Since the device uses the same resource for both CNC and FDB upload queue, there are certain restrictions:
After triggering an FDB upload, but before starting a CNC upload, the application must retrieve all FU messages from the FDB upload queue by calling cpssDxChBrgFdbFuMsgBlockGet.
After starting a CNC upload, but before triggering an FDB upload, the application must retrieve all CNC messages from the FDB upload queue by calling cpssDxChCncUploadedBlockGet.
Note that the CPSS driver uses some of this memory for descriptor management, so not all of the CNC DMA memory is used directly for the upload operation.
For Caspian devices: Before the first call to cpssDxChCncBlockUploadTrigger for a block in a unit, initialize the CNC DMA queues by calling cpssDxChCncUploadInit per CNC unit. Calling cpssDxChCncUploadInit splits the CNC memory block evenly between all device port groups.
To trigger an upload of a given counter block, call cpssDxChCncBlockUploadTrigger. For multi port-group devices call cpssDxChCncPortGroupBlockUploadTrigger with a specific port group. The API verifies the memory is cleaned from CNC counters or FU messages from the previous triggering.
The application may sequentially trigger the upload of several CNC blocks before starting to retrieve uploaded counters.
To check if a CNC upload is finished, the application must call cpssDxChCncBlockUploadInProcessGet, or for multi-port group devices cpssDxChCncPortGroupBlockUploadInProcessGet. This API returns a bitmap representing the blocks in process as an output parameter.
If the n-th bit in inProcessBlocksBmpPtr is set, it means that uploading block n of the CNC counter is not finished yet.
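Interpreting the returned bitmap is a plain bit test. The standalone C sketch below assumes the bitmap fits in a 32-bit word. Both helper names are hypothetical and are not CPSS functions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* True if the upload of block blockNum is still in process,
 * that is, if bit blockNum of the bitmap is set. */
static bool block_upload_in_process(uint32_t inProcessBlocksBmp, unsigned blockNum)
{
    assert(blockNum < 32u);              /* assumption: 32-bit bitmap */
    return ((inProcessBlocksBmp >> blockNum) & 1u) != 0u;
}

/* True if no block upload is in process. */
static bool all_uploads_finished(uint32_t inProcessBlocksBmp)
{
    return inProcessBlocksBmp == 0u;
}
```

For a bitmap value of 0x5, blocks 0 and 2 are still uploading while block 1 has finished.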
To retrieve a block (array) of counter entries, call cpssDxChCncUploadedBlockGet. For multi port-group devices, call cpssDxChCncPortGroupUploadedBlockGet. The API returns the actual size of the array as an output parameter, as well as a pointer to the array itself.
The CNC upload transfers a whole CNC block to the FU queue. The application must retrieve all the transferred counters until the returned value is GT_NO_MORE. This ensures that all the entries were read.
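The retrieval loop described above can be sketched against a mock queue in standalone C. Everything in this sketch is hypothetical: `Status` stands in for the CPSS status codes, and `mock_uploaded_block_get` only imitates a getter that returns fewer entries than requested once the queue runs dry. The exact return convention of the real API should be taken from the CPSS API reference.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for GT_OK and GT_NO_MORE. */
typedef enum { ST_OK, ST_NO_MORE } Status;

typedef struct {
    const uint64_t *entries;   /* uploaded entries still waiting in the mock queue */
    size_t remaining;
} MockQueue;

/* Mock getter: copies up to *countPtr entries into out, writes the number
 * actually copied back to *countPtr, and returns ST_NO_MORE once the queue
 * is empty (assumed convention for this sketch). */
static Status mock_uploaded_block_get(MockQueue *q, uint64_t *out, size_t *countPtr)
{
    size_t n = (*countPtr < q->remaining) ? *countPtr : q->remaining;
    size_t i;

    for (i = 0; i < n; i++) {
        out[i] = q->entries[i];
    }
    q->entries += n;
    q->remaining -= n;
    *countPtr = n;
    return (q->remaining == 0) ? ST_NO_MORE : ST_OK;
}

/* Retrieve entries in chunks until the getter reports that no more are waiting.
 * dst must have room for every queued entry. Returns the total retrieved. */
static size_t drain_all(MockQueue *q, uint64_t *dst, size_t chunk)
{
    size_t total = 0;
    Status st;

    assert(chunk > 0);
    do {
        size_t n = chunk;
        st = mock_uploaded_block_get(q, dst + total, &n);
        total += n;                      /* entries from the final call count too */
    } while (st != ST_NO_MORE);
    return total;
}
```

The loop adds the count from every call, including the one that reports no more entries, so a final partial chunk is not lost.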