Issues with memcpy on SAML21

Go To Last Post
6 posts / 0 new
Author
Message
#1
  • 1
  • 2
  • 3
  • 4
  • 5
Total votes: 0

I have been chasing a really nasty bug and I think I have narrowed it down to memcpy. Is there some delay or asynchronous nature to memcpy that I should be aware of, perhaps specific to this architecture or compiler? My actual problem is that my freeRTOS streamBuffer seems to be not working, and different writes to it stomp on each other (example 2 below).

 

example 1: I use memcopy to clear my TX buffer before each iteration. This was to troubleshoot a different issue but I noticed that if I was debugging then everything was fine. BUT if i ran the system normally I would get nothing but zeros.

void DATA_task(void* pvParameters)
{
    int32_t err;
    (void)pvParameters;
    static DATA_TRANSMISSION_t usbPacket;
    static uint32_t idx;

    err = xStreamBufferSetTriggerLevel(xDATA_sb, PAGE_SIZE_LESS);

    /* Receive and write data forever. */
    for(;;)
    {
        /* Receive a page worth of data. */
        memset(usbPacket.data, 0, PAGE_SIZE_LESS);  // fixed typo: was using memset in my original test
        idx = 0;
        while(idx < PAGE_SIZE_LESS) {
            err = xStreamBufferReceive(xDATA_sb, (usbPacket.data + idx), (PAGE_SIZE_LESS - idx), portMAX_DELAY);
            idx += err;
        }

        /* Write data to USB if the appropriate flag is set. */
        if((xEventGroupGetBits(xSYSEVENTS_handle) & EVENT_LOGTOUSB) != 0)
        {
            // setup the packet header and CRC start value, then perform CRC32
            usbPacket.startSymbol = USB_PACKET_START_SYM;
            usbPacket.crc = 0xFFFFFFFF;
            crc_sync_crc32(&CRC_0, (uint32_t*)usbPacket.data, PAGE_SIZE_LESS/sizeof(uint32_t), &usbPacket.crc);

            // complement CRC to match standard CRC32 implementations
            usbPacket.crc ^= 0xFFFFFFFF;

            if((usb_state() == USB_Configured) && usb_dtr()) {
                err = usb_write(&usbPacket, sizeof(DATA_TRANSMISSION_t));
            }
        }
    }
}

Example 2: My real problem. I am using FreeRTOS and writing to a StreamBuffer twice from within the same task. It is an IMU so basically one write for a packet of acceleration data and one write for a packet of magnetic data. Somehow the packets are stomping on each other. I dug down into the FreeRTOS code and it looks like they use memcpy under the covers... I feel silly blaming my issues on a library function but I don't see anything else?

void IMU_task(void* pvParameters)
{
    const  TickType_t  xMaxBlockTime = pdMS_TO_TICKS( 3500 );     // max block time, set to slightly more than accelerometer ISR period
    static IMU_MSG_t   accMsg;          // data Packet for the accelerometer
    static IMU_MSG_t   magMsg;          // data Packet for the magnetometer
    BaseType_t  xResult;                // holds return value of blocking function
    int32_t     err = 0;                // for catching API errors
    uint32_t    ulNotifyValue;          // notification value from ISRs
    (void)pvParameters;

    // initialize the IMU
    err = lsm303_init(&I2C_IMU);
    err = lsm303_acc_startFIFO((((int32_t)pvParameters>>24)&0xFF), (((int32_t)pvParameters>>16)&0xFF));
    err = lsm303_mag_start((((int32_t)pvParameters>>8)&0xFF));
    lsm303_acc_motionDetectStart(MOTION_INT_X_HIGH, 250, 1);

    // enable the data ready interrupts
    ext_irq_register(IMU_INT1_XL, AccelerometerDataReadyISR);
    ext_irq_register(IMU_INT_MAG, MagnetometerDataReadyISR);
    ext_irq_register(IMU_INT2_XL, AccelerometerMotionISR);

    // initialize the message headers. Size of accMsg set at send time
    dataheader_init(&accMsg.header);
    accMsg.header.id       = DEVICE_ID_ACCELEROMETER;

    dataheader_init(&magMsg.header);
    magMsg.header.id       = DEVICE_ID_MAGNETIC_FIELD;
    magMsg.header.size     = sizeof(AxesRaw_t)*25;

    for(;;) {

        xResult = xTaskNotifyWait( pdFALSE,          /* Don't clear bits on entry. */
                                   ULONG_MAX,        /* Clear all bits on exit. */
                                   &ulNotifyValue,   /* Stores the notified value. */
                                   xMaxBlockTime );  /* Max time to block before writing an error packet */

        if( pdPASS == xResult ) {
            if( ACC_DATA_READY & ulNotifyValue ) {
                bool overrun;   // TODO: set overflow flag

                portENTER_CRITICAL();
                err = lsm303_acc_FIFOread(&accMsg.data[0], IMU_LOG_SIZE, &overrun);
                portEXIT_CRITICAL();
                if(overrun) {
                    gpio_set_pin_level(LED_GREEN, false);
                }
                if(err > (IMU_LOG_SIZE*6)) {
                    gpio_set_pin_level(LED_RED, false);
                }

                // set timestamp and set error flag if needed
                timestamp_FillHeader(&accMsg.header);
                if(err < 0) {
                    accMsg.header.id  |= DEVICE_ERR_COMMUNICATIONS;
                    accMsg.header.size = 0;
                    ctrlLog_write((uint8_t*)&accMsg, sizeof(DATA_HEADER_t));
                    accMsg.header.id &= ~(DEVICE_ERR_MASK);
                }
                else {
                    memset(accMsg.data, 'f', 25*6);
                    // TODO: make buffer slightly larger and have the log write calculate size from err and header size.
                    accMsg.header.size = err;   // the number of bytes read on last read
                    ctrlLog_write((uint8_t*)&accMsg, sizeof(IMU_MSG_t));
                }
            } // end of accelerometer state

            if( MAG_DATA_READY & ulNotifyValue ) {
                static uint_fast8_t magItr = 0;

                portENTER_CRITICAL();
                err = lsm303_mag_rawRead(&magMsg.data[magItr]);
                portEXIT_CRITICAL();
                if(err != ERR_NONE) {
                    magMsg.header.id |= DEVICE_ERR_COMMUNICATIONS;
                    magMsg.data[magItr].xAxis = 0xFF;
                    magMsg.data[magItr].yAxis = 0xFF;
                    magMsg.data[magItr].zAxis = 0xFF;
                }
                magItr++;

                if(magItr >= IMU_LOG_SIZE) {
                    memset(magMsg.data, 'l', 25*6);
                    timestamp_FillHeader(&magMsg.header);
                    ctrlLog_write((uint8_t*)&magMsg, sizeof(IMU_MSG_t));
                    magMsg.header.id &= ~(DEVICE_ERR_MASK);
                    magItr = 0;
                }
            }
        }
    } // END FOREVER LOOP
}

int32_t ctrlLog_write(uint8_t* buff, const uint32_t LEN)
{
    uint32_t err;

    portENTER_CRITICAL();
    if(xSemaphoreTake(DATA_mutex, portMAX_DELAY)) {
        // bail early if there isn't enough space
        if(xStreamBufferSpacesAvailable(xDATA_sb) >= LEN) {
            err = xStreamBufferSend(xDATA_sb, buff, LEN, 0);
            if(err < LEN) {
                gpio_set_pin_level(LED_RED, false);
            }
        }
        else {
            err = ERR_NO_RESOURCE;
            gpio_set_pin_level(LED_RED, false);
        }
        xSemaphoreGive(DATA_mutex);
        portEXIT_CRITICAL();
    }
    else {
        err = ERR_FAILURE;
        gpio_set_pin_level(LED_RED, false);
    }

    return err;
}

 

 

The result is the picture below, from my hex editor:

Last Edited: Tue. Jun 12, 2018 - 11:39 PM
  • 1
  • 2
  • 3
  • 4
  • 5
Total votes: 0

Did you mean to use memset rather than memcpy? Otherwise you're copying data from address 0, which is probably not what you want.

  • 1
  • 2
  • 3
  • 4
  • 5
Total votes: 0

nice catch, But i was actually using memset() in my tests. It is no longer in my code though, so i typed the (wrong) line back in by hand here when I copied my code over. I fixed it but left a note.

 

So i guess my strange behavior is with memset and memcpy, with memcpy being used inside the freeRTOS functions to copy to the queue.

  • 1
  • 2
  • 3
  • 4
  • 5
Total votes: 0

Not knowing the declaration of DATA_TRANSMISSION_t and IMU_MSG_t  it's impossible to know what is going on. But commonly such problems come from incorrect allocation. The magic numbers 25 and 25*6 look worrying, are there allocations matching this (then why not use sizeof or a #define)?

/Lars 

Last Edited: Wed. Jun 13, 2018 - 07:12 PM
  • 1
  • 2
  • 3
  • 4
  • 5
Total votes: 0

The magic numbers are only in my debug code. The actual acceleration and magnetic data was hard to tell apart so I added the memset right before transmission to make sure the data is identifiable, without interfering with the collection. Here is the definitions of the two structs, and you can see where the 25 and 6 come from (25 samples per packet, each sample is 3 16-bit integers). The magnet data comes in much faster than the Acceleration (the accelerometer has a FIFO the magnetometer does not), so a data ready interrupt from the magnetometer could happen during the accelerometer code, but all the ISR does is send a notification to the task, and then the magnetometer block should run.

 

typedef struct __attribute__((__packed__)){
    uint16_t startSym;    // symbol to indicate start of packet
    uint8_t  id;	      // Upper four bits is the device ID, lower four are device specific event flags
    uint8_t  packetCount; // counter to number packets from a specific sensor in leu of milliseconds
    uint32_t timestamp;   // timestamp, seconds since reference year
    uint16_t size;		  // size of data packet to follow in bytes
} DATA_HEADER_t;

#define IMU_LOG_SIZE               (25)
typedef struct __attribute__((__packed__)){
    DATA_HEADER_t header;
    AxesRaw_t     data[IMU_LOG_SIZE];
} IMU_MSG_t;

#define PAGE_SIZE_LESS              (2048)
typedef struct __attribute__((__packed__)){
    uint32_t startSymbol;           // start symbol for the data transmission
    uint8_t  data[PAGE_SIZE_LESS];  // one page of data from flash
    uint32_t crc;                   // crc32 of the DATA (not the start symbol) using IEEE CRC32 polynomial
} DATA_TRANSMISSION_t;

 

  • 1
  • 2
  • 3
  • 4
  • 5
Total votes: 0

Still open to any ideas on this, but I changed the architecture of my design.

Originally I was relying on the FreeRTOS stream buffer watermark to block until it was hit. It seems there is a subtly to the docs that end up with the behavior that you will receive LESS than the watermark if there is data in the buffer when you call the read function. I only want complete 2Kb buffers (page size for flash) and occasionally I was getting pages data less than this.  Adding code to check the return size(amount of data read) and continue adding till the buffer was full fixes that problem:

 

/* Receive a page worth of data. */
idx = 0;
while(idx < PAGE_SIZE_LESS) {
    err = xStreamBufferReceive(xDATA_sb, (usbPacket.data + idx), (PAGE_SIZE_LESS - idx), portMAX_DELAY);
    idx += err;
}

But this did not fix the data stomping problem as I described above. That problem seems to have gone away by using the freeRTOS message buffer. The message buff is a stream buffer, but it will only accept an entire "packet" and it will either fail or succeed at writing the whole thing. So pretty much what I was trying to do anyways, but using some RAM in the buffer instead of my mutex and size checking code. It loads a structure into the buffer ahead of each packet that lets the read function know how big the next item is and the read function will fail if you don't give it an array large enough. I know how big all my packets are so I just made something that covers all cases. This made the write function much simpler:

int32_t ctrlLog_write(uint8_t* buff, const uint32_t LEN)
{
    portENTER_CRITICAL();
    uint32_t err = xMessageBufferSend(xDATA_sb, (void*)buff, LEN, 0);
    portEXIT_CRITICAL();
    return err;
}

and then I just have to use the code from my first code snippet to copy from each packet into the flash buffer and split across buffers as needed.

 

Seems to fix the data stomping problem so I guess that is great. But it is much less RAM efficient (which I am pretty tight on) since I need two arrays on the read side instead of one now, and the extra space in the queue to store each packets size. If I find out anything more I will post it here. Though I will probably look at our flash library to save some RAM since despite the overhead the message buffer is pretty cool and makes other parts of the design simpler.