On 7/3/25 2:49 AM, Arnaud POULIQUEN wrote:
>
>
> On 7/2/25 19:00, Tanmay Shah wrote:
>>
>>
>> On 7/2/25 10:47 AM, Arnaud POULIQUEN wrote:
>>>
>>>
>>> On 7/2/25 17:23, Tanmay Shah wrote:
>>>>
>>>>
>>>> On 7/2/25 2:18 AM, Arnaud POULIQUEN wrote:
>>>>>
>>>>>
>>>>> On 7/1/25 23:19, Tanmay Shah wrote:
>>>>>>
>>>>>>
>>>>>> On 7/1/25 1:06 PM, Tanmay Shah wrote:
>>>>>>>
>>>>>>>
>>>>>>> On 7/1/25 12:56 PM, Tanmay Shah wrote:
>>>>>>>>
>>>>>>>>
>>>>>>>> On 7/1/25 12:18 PM, Arnaud POULIQUEN wrote:
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> On 7/1/25 17:16, Tanmay Shah wrote:
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> On 7/1/25 3:07 AM, Arnaud POULIQUEN wrote:
>>>>>>>>>>> Hi Tanmay,
>>>>>>>>>>>
>>>>>>>>>>> On 6/27/25 23:29, Tanmay Shah wrote:
>>>>>>>>>>>> Hello all,
>>>>>>>>>>>>
>>>>>>>>>>>> I am implementing remoteproc recovery on attach-detach use case.
>>>>>>>>>>>> I have implemented the feature in the platform driver, and it works for
>>>>>>>>>>>> boot
>>>>>>>>>>>> recovery.
>>>>>>>>>>>
>>>>>>>>>>> Few questions to better understand your use case.
>>>>>>>>>>>
>>>>>>>>>>> 1) The linux remoteproc firmware attach to a a remote processor, and you
>>>>>>>>>>> generate a crash of the remote processor, right?
>>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> Yes correct.
>>>>>>>>>>
>>>>>>>>>>> 1) How does the remoteprocessor reboot? On a remoteproc request or it
>>>>>>>>>>> is an
>>>>>>>>>>> autoreboot independent from the Linux core?
>>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> It is auto-reboot independent from the linux core.
>>>>>>>>>>
>>>>>>>>>>> 2) In case of auto reboot, when does the remoteprocessor send an even to
>>>>>>>>>>> the
>>>>>>>>>>> Linux remoteproc driver ? beforeor after the reset?
>>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> Right now, when Remote reboots, it sends crash event to remoteproc driver
>>>>>>>>>> after
>>>>>>>>>> reboot.
>>>>>>>>>>
>>>>>>>>>>> 3) Do you expect to get core dump on crash?
>>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> No coredump expected as of now, but only recovery. Eventually will
>>>>>>>>>> implement
>>>>>>>>>> coredump functionality as well.
>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> However, I am stuck at the testing phase.
>>>>>>>>>>>>
>>>>>>>>>>>> When should firmware report the crash ? After reboot ? or during some
>>>>>>>>>>>> kind of
>>>>>>>>>>>> crash handler ?
>>>>>>>>>>>>
>>>>>>>>>>>> So far, I am reporting crash after rebooting remote processor, but it
>>>>>>>>>>>> doesn't
>>>>>>>>>>>> seem to work i.e. I don't see rpmsg devices created after recovery.>
>>>>>>>>>>>> What should be the correct process to test this feature ? How other
>>>>>>>>>>>> platforms
>>>>>>>>>>>> are testing this?
>>>>>>>>>>>
>>>>>>>>>>> I have never tested it on ST board. As a first analysis, in case of
>>>>>>>>>>> autoreboot
>>>>>>>>>>> of the remote processor, it look like you should detach and reattach to
>>>>>>>>>>> recover.
>>>>>>>>>>
>>>>>>>>>> That is what's done from the remoteproc framework.
>>>>>>>>>>
>>>>>>>>>>> - On detach the rpmsg devices should be unbind
>>>>>>>>>>> - On attach the remote processor should request RPmsg channels using
>>>>>>>>>>> the NS
>>>>>>>>>>> announcement mechanism
>>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> Main issue is, Remote firmware needs to wait till all above happens. Then
>>>>>>>>>> only
>>>>>>>>>> initialize virtio devices. Currently we don't have any way to notify
>>>>>>>>>> recovery
>>>>>>>>>> progress from linux to remote fw in the remoteproc framework. So I might
>>>>>>>>>> have to
>>>>>>>>>> introduce some platform specific mechanism in remote firmware to wait for
>>>>>>>>>> recovery to complete successfully.
>>>>>>>>>
>>>>>>>>> I guess the rproc->clean_table contains a copy of the resource table
>>>>>>>>> that is
>>>>>>>>> reapplied on attach, and the virtio devices should be re-probed, right?
>>>>>>>>>
>>>>>>>>> During the virtio device probe, the vdev status in the resource table is
>>>>>>>>> updated
>>>>>>>>> to 7 when virtio is ready to communicate. Virtio should then call
>>>>>>>>> rproc_virtio_notify() to inform the remote processor of the status update.
>>>>>>>>> At this stage, your remoteproc driver should be able to send a mailbox
>>>>>>>>> message
>>>>>>>>> to inform the remote side about the recovery completion.
>>>>>>>>>
>>>>>>>>
>>>>>>>> I think I spot the problem now.
>>>>>>>>
>>>>>>>> Linux side: file: remoteproc_core.c
>>>>>>>> rproc_attach_recovery
>>>>>>>> __rproc_detach
>>>>>>>> cleans up the resource table and re-loads it
>>>>>>>> __rproc_attach
>>>>>>>> stops and re-starts subdevices
>>>>>>>>
>>>>>>>>
>>>>>>>> Remote side:
>>>>>>>> Remote re-boots after crash
>>>>>>>> Detects crash happened previously
>>>>>>>> notify crash to Linux
>>>>>>>> (Linux is executing above flow meanwhile)
>>>>>>>> starts creating virtio devices
>>>>>>>> **rproc_virtio_create_vdev - parse vring & create vdev device**
>>>>>>>> **rproc_virtio_wait_remote_ready - wait for remote ready** [1]
>>>>>>>>
>>>>>>>> I think Remote should wait on DRIVER_OK bit, before creating virtio devices.
>>>>>>>> The temporary solution I implemented was to make sure vrings addresses are
>>>>>>>> not 0xffffffff like following:
>>>>>>>>
>>>>>>>> while(rsc->rpmsg_vring0.da == FW_RSC_U32_ADDR_ANY ||
>>>>>>>> rsc->rpmsg_vring1.da == FW_RSC_U32_ADDR_ANY) {
>>>>>>>> usleep(100);
>>>>>>>> metal_cache_invalidate(rsc, rproc->rsc_len);
>>>>>>>> }
>>>>>>>>
>>>>>>>> Above works, but I think better solution is to change sequence where remote
>>>>>>>> waits before creating virtio devices.
>>>>>>>
>>>>>>> I am sorry, I should have said, remote should wait before parsing and
>>>>>>> assigning vrings to virtio device.
>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> [1] https://github.com/OpenAMP/open-amp/blob/391671ba24840833d882c1a75c5d7307703b1cf1/lib/remoteproc/remoteproc.c#L994
>>>>>>>>
>>>>>>
>>>>>> Actually upon further checking, I think above code is okay. I see that
>>>>>> wait_remote_ready is called before vrings are setup on remote fw side.
>>>>>>
>>>>>> However, during recovery time on remote side, somehow I still have to
>>>>>> implement
>>>>>> platform specific wait for vrings to setup correctly.
>>>>>>
>>>>>> From linux side, DRIVER_OK bit is set before vrings are setup correctly.
>>>>>> Because of that, when remote firmware sets up wrong vring addresses and then
>>>>>> rpmsg channels are not created.
>>>>>>
>>>>>> I am investigating on this further.
>>>>>
>>>>> Do you reset the vdev status as requested by the virtio spec?
>>>>> https://docs.oasis-open.org/virtio/virtio/v1.3/csd01/virtio-v1.3-csd01.html…
>>>>>
>>>>> Regards,
>>>>> Arnaud
>>>>>
>>>>
>>>> Yes I do. I am actually restoring deafult resource table on firmware side, which
>>>> will set rpmsg_vdev status to 0.
>>>>
>>>> However, when printing vrings right before wait_remote_ready, I see vrings are
>>>> not set correctly from linux side:
>>>>
>>>> `vring0 = 0xFFFFFFFF, vring1 = 0xFFFFFFFF`
>>>
>>> That makes sense if values corresponds to the initial values of the resource
>>> table
>>> rproc->clean_table should contain a copy of these initial values.
>>>
>>>>
>>>> However, the rproc state was still moved to attach when checked from remoteproc
>>>> sysfs.
>>>
>>> Does the rproc_handle_resources() is called before going back in attached state?
>>
>> You are right. I think __rproc_attach() isn't calling rproc_handle_resources().
>>
>> But recovery is supported by other platforms so I think recovery should work
>> without calling rproc_handle_resources().
>
> Right. Having taken a deeper look at the code, it seems that there is an issue.
> In rproc_reset_rsc_table_on_detach(), we clean the resource table without
> calling rproc_resource_cleanup().
>
> It seems to me that rproc_reset_rsc_table_on_detach() should not be called in
> __rproc_detach() but rather in rproc_detach() after calling
> rproc_resource_cleanup().
>
>
Yes, that sounds correct. It's a long weekend here in the US, so I will try
this next week and update.

Thanks,
Tanmay
>>
>> May be re-storing resource table from firmware side after reboot isn't a good
>> idea. I will try without it.
>>
>>>
>>>>
>>>> `cat /sys/class/remoteproc/remoteproc0/state`
>>>> attached
>>>>
>>>> Somehow the sync between remote fw and linux isn't right.
>>>>
>>>>>>
>>>>>>>>
>>>>>>>> Thanks,
>>>>>>>> Tanmay
>>>>>>>>> Regards
>>>>>>>>> Arnaud
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>> Regards,
>>>>>>>>>>> Arnaud
>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> Thanks,
>>>>>>>>>>>> Tanmay
>>>>>>>>>>
>>>>>>>>
>>>>>>>
>>>>>>
>>>>
>>
On 7/2/25 10:47 AM, Arnaud POULIQUEN wrote:
>
>
> On 7/2/25 17:23, Tanmay Shah wrote:
>>
>>
>> On 7/2/25 2:18 AM, Arnaud POULIQUEN wrote:
>>>
>>>
>>> On 7/1/25 23:19, Tanmay Shah wrote:
>>>>
>>>>
>>>> On 7/1/25 1:06 PM, Tanmay Shah wrote:
>>>>>
>>>>>
>>>>> On 7/1/25 12:56 PM, Tanmay Shah wrote:
>>>>>>
>>>>>>
>>>>>> On 7/1/25 12:18 PM, Arnaud POULIQUEN wrote:
>>>>>>>
>>>>>>>
>>>>>>> On 7/1/25 17:16, Tanmay Shah wrote:
>>>>>>>>
>>>>>>>>
>>>>>>>> On 7/1/25 3:07 AM, Arnaud POULIQUEN wrote:
>>>>>>>>> Hi Tanmay,
>>>>>>>>>
>>>>>>>>> On 6/27/25 23:29, Tanmay Shah wrote:
>>>>>>>>>> Hello all,
>>>>>>>>>>
>>>>>>>>>> I am implementing remoteproc recovery on attach-detach use case.
>>>>>>>>>> I have implemented the feature in the platform driver, and it works for
>>>>>>>>>> boot
>>>>>>>>>> recovery.
>>>>>>>>>
>>>>>>>>> Few questions to better understand your use case.
>>>>>>>>>
>>>>>>>>> 1) The linux remoteproc firmware attach to a a remote processor, and you
>>>>>>>>> generate a crash of the remote processor, right?
>>>>>>>>>
>>>>>>>>
>>>>>>>> Yes correct.
>>>>>>>>
>>>>>>>>> 1) How does the remoteprocessor reboot? On a remoteproc request or it is an
>>>>>>>>> autoreboot independent from the Linux core?
>>>>>>>>>
>>>>>>>>
>>>>>>>> It is auto-reboot independent from the linux core.
>>>>>>>>
>>>>>>>>> 2) In case of auto reboot, when does the remoteprocessor send an even to
>>>>>>>>> the
>>>>>>>>> Linux remoteproc driver ? beforeor after the reset?
>>>>>>>>>
>>>>>>>>
>>>>>>>> Right now, when Remote reboots, it sends crash event to remoteproc driver
>>>>>>>> after
>>>>>>>> reboot.
>>>>>>>>
>>>>>>>>> 3) Do you expect to get core dump on crash?
>>>>>>>>>
>>>>>>>>
>>>>>>>> No coredump expected as of now, but only recovery. Eventually will implement
>>>>>>>> coredump functionality as well.
>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> However, I am stuck at the testing phase.
>>>>>>>>>>
>>>>>>>>>> When should firmware report the crash ? After reboot ? or during some
>>>>>>>>>> kind of
>>>>>>>>>> crash handler ?
>>>>>>>>>>
>>>>>>>>>> So far, I am reporting crash after rebooting remote processor, but it
>>>>>>>>>> doesn't
>>>>>>>>>> seem to work i.e. I don't see rpmsg devices created after recovery.>
>>>>>>>>>> What should be the correct process to test this feature ? How other
>>>>>>>>>> platforms
>>>>>>>>>> are testing this?
>>>>>>>>>
>>>>>>>>> I have never tested it on ST board. As a first analysis, in case of
>>>>>>>>> autoreboot
>>>>>>>>> of the remote processor, it look like you should detach and reattach to
>>>>>>>>> recover.
>>>>>>>>
>>>>>>>> That is what's done from the remoteproc framework.
>>>>>>>>
>>>>>>>>> - On detach the rpmsg devices should be unbind
>>>>>>>>> - On attach the remote processor should request RPmsg channels using the NS
>>>>>>>>> announcement mechanism
>>>>>>>>>
>>>>>>>>
>>>>>>>> Main issue is, Remote firmware needs to wait till all above happens. Then
>>>>>>>> only
>>>>>>>> initialize virtio devices. Currently we don't have any way to notify
>>>>>>>> recovery
>>>>>>>> progress from linux to remote fw in the remoteproc framework. So I might
>>>>>>>> have to
>>>>>>>> introduce some platform specific mechanism in remote firmware to wait for
>>>>>>>> recovery to complete successfully.
>>>>>>>
>>>>>>> I guess the rproc->clean_table contains a copy of the resource table that is
>>>>>>> reapplied on attach, and the virtio devices should be re-probed, right?
>>>>>>>
>>>>>>> During the virtio device probe, the vdev status in the resource table is
>>>>>>> updated
>>>>>>> to 7 when virtio is ready to communicate. Virtio should then call
>>>>>>> rproc_virtio_notify() to inform the remote processor of the status update.
>>>>>>> At this stage, your remoteproc driver should be able to send a mailbox
>>>>>>> message
>>>>>>> to inform the remote side about the recovery completion.
>>>>>>>
>>>>>>
>>>>>> I think I spot the problem now.
>>>>>>
>>>>>> Linux side: file: remoteproc_core.c
>>>>>> rproc_attach_recovery
>>>>>> __rproc_detach
>>>>>> cleans up the resource table and re-loads it
>>>>>> __rproc_attach
>>>>>> stops and re-starts subdevices
>>>>>>
>>>>>>
>>>>>> Remote side:
>>>>>> Remote re-boots after crash
>>>>>> Detects crash happened previously
>>>>>> notify crash to Linux
>>>>>> (Linux is executing above flow meanwhile)
>>>>>> starts creating virtio devices
>>>>>> **rproc_virtio_create_vdev - parse vring & create vdev device**
>>>>>> **rproc_virtio_wait_remote_ready - wait for remote ready** [1]
>>>>>>
>>>>>> I think Remote should wait on DRIVER_OK bit, before creating virtio devices.
>>>>>> The temporary solution I implemented was to make sure vrings addresses are
>>>>>> not 0xffffffff like following:
>>>>>>
>>>>>> while(rsc->rpmsg_vring0.da == FW_RSC_U32_ADDR_ANY ||
>>>>>> rsc->rpmsg_vring1.da == FW_RSC_U32_ADDR_ANY) {
>>>>>> usleep(100);
>>>>>> metal_cache_invalidate(rsc, rproc->rsc_len);
>>>>>> }
>>>>>>
>>>>>> Above works, but I think better solution is to change sequence where remote
>>>>>> waits before creating virtio devices.
>>>>>
>>>>> I am sorry, I should have said, remote should wait before parsing and
>>>>> assigning vrings to virtio device.
>>>>>
>>>>>>
>>>>>>
>>>>>> [1] https://github.com/OpenAMP/open-amp/blob/391671ba24840833d882c1a75c5d7307703b1cf1/lib/remoteproc/remoteproc.c#L994
>>>>>>
>>>>
>>>> Actually upon further checking, I think above code is okay. I see that
>>>> wait_remote_ready is called before vrings are setup on remote fw side.
>>>>
>>>> However, during recovery time on remote side, somehow I still have to implement
>>>> platform specific wait for vrings to setup correctly.
>>>>
>>>> From linux side, DRIVER_OK bit is set before vrings are setup correctly.
>>>> Because of that, when remote firmware sets up wrong vring addresses and then
>>>> rpmsg channels are not created.
>>>>
>>>> I am investigating on this further.
>>>
>>> Do you reset the vdev status as requested by the virtio spec?
>>> https://docs.oasis-open.org/virtio/virtio/v1.3/csd01/virtio-v1.3-csd01.html…
>>>
>>> Regards,
>>> Arnaud
>>>
>>
>> Yes I do. I am actually restoring deafult resource table on firmware side, which
>> will set rpmsg_vdev status to 0.
>>
>> However, when printing vrings right before wait_remote_ready, I see vrings are
>> not set correctly from linux side:
>>
>> `vring0 = 0xFFFFFFFF, vring1 = 0xFFFFFFFF`
>
> That makes sense if the values correspond to the initial values of the resource
> table; rproc->clean_table should contain a copy of these initial values.
>
>>
>> However, the rproc state was still moved to attach when checked from remoteproc
>> sysfs.
>
> Is rproc_handle_resources() called before going back to the attached state?
You are right. I think __rproc_attach() isn't calling
rproc_handle_resources().

But recovery is supported by other platforms, so I think recovery should
work without calling rproc_handle_resources().

Maybe restoring the resource table from the firmware side after reboot isn't
a good idea. I will try without it.
>
>>
>> `cat /sys/class/remoteproc/remoteproc0/state`
>> attached
>>
>> Somehow the sync between remote fw and linux isn't right.
>>
>>>>
>>>>>>
>>>>>> Thanks,
>>>>>> Tanmay
>>>>>>> Regards
>>>>>>> Arnaud
>>>>>>>
>>>>>>>
>>>>>>>>
>>>>>>>>> Regards,
>>>>>>>>> Arnaud
>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> Thanks,
>>>>>>>>>> Tanmay
>>>>>>>>
>>>>>>
>>>>>
>>>>
>>
On 7/2/25 2:18 AM, Arnaud POULIQUEN wrote:
>
>
> On 7/1/25 23:19, Tanmay Shah wrote:
>>
>>
>> On 7/1/25 1:06 PM, Tanmay Shah wrote:
>>>
>>>
>>> On 7/1/25 12:56 PM, Tanmay Shah wrote:
>>>>
>>>>
>>>> On 7/1/25 12:18 PM, Arnaud POULIQUEN wrote:
>>>>>
>>>>>
>>>>> On 7/1/25 17:16, Tanmay Shah wrote:
>>>>>>
>>>>>>
>>>>>> On 7/1/25 3:07 AM, Arnaud POULIQUEN wrote:
>>>>>>> Hi Tanmay,
>>>>>>>
>>>>>>> On 6/27/25 23:29, Tanmay Shah wrote:
>>>>>>>> Hello all,
>>>>>>>>
>>>>>>>> I am implementing remoteproc recovery on attach-detach use case.
>>>>>>>> I have implemented the feature in the platform driver, and it works for boot
>>>>>>>> recovery.
>>>>>>>
>>>>>>> Few questions to better understand your use case.
>>>>>>>
>>>>>>> 1) The linux remoteproc firmware attach to a a remote processor, and you
>>>>>>> generate a crash of the remote processor, right?
>>>>>>>
>>>>>>
>>>>>> Yes correct.
>>>>>>
>>>>>>> 1) How does the remoteprocessor reboot? On a remoteproc request or it is an
>>>>>>> autoreboot independent from the Linux core?
>>>>>>>
>>>>>>
>>>>>> It is auto-reboot independent from the linux core.
>>>>>>
>>>>>>> 2) In case of auto reboot, when does the remoteprocessor send an even to the
>>>>>>> Linux remoteproc driver ? beforeor after the reset?
>>>>>>>
>>>>>>
>>>>>> Right now, when Remote reboots, it sends crash event to remoteproc driver
>>>>>> after
>>>>>> reboot.
>>>>>>
>>>>>>> 3) Do you expect to get core dump on crash?
>>>>>>>
>>>>>>
>>>>>> No coredump expected as of now, but only recovery. Eventually will implement
>>>>>> coredump functionality as well.
>>>>>>
>>>>>>>>
>>>>>>>> However, I am stuck at the testing phase.
>>>>>>>>
>>>>>>>> When should firmware report the crash ? After reboot ? or during some
>>>>>>>> kind of
>>>>>>>> crash handler ?
>>>>>>>>
>>>>>>>> So far, I am reporting crash after rebooting remote processor, but it
>>>>>>>> doesn't
>>>>>>>> seem to work i.e. I don't see rpmsg devices created after recovery.>
>>>>>>>> What should be the correct process to test this feature ? How other
>>>>>>>> platforms
>>>>>>>> are testing this?
>>>>>>>
>>>>>>> I have never tested it on ST board. As a first analysis, in case of
>>>>>>> autoreboot
>>>>>>> of the remote processor, it look like you should detach and reattach to
>>>>>>> recover.
>>>>>>
>>>>>> That is what's done from the remoteproc framework.
>>>>>>
>>>>>>> - On detach the rpmsg devices should be unbind
>>>>>>> - On attach the remote processor should request RPmsg channels using the NS
>>>>>>> announcement mechanism
>>>>>>>
>>>>>>
>>>>>> Main issue is, Remote firmware needs to wait till all above happens. Then only
>>>>>> initialize virtio devices. Currently we don't have any way to notify recovery
>>>>>> progress from linux to remote fw in the remoteproc framework. So I might
>>>>>> have to
>>>>>> introduce some platform specific mechanism in remote firmware to wait for
>>>>>> recovery to complete successfully.
>>>>>
>>>>> I guess the rproc->clean_table contains a copy of the resource table that is
>>>>> reapplied on attach, and the virtio devices should be re-probed, right?
>>>>>
>>>>> During the virtio device probe, the vdev status in the resource table is
>>>>> updated
>>>>> to 7 when virtio is ready to communicate. Virtio should then call
>>>>> rproc_virtio_notify() to inform the remote processor of the status update.
>>>>> At this stage, your remoteproc driver should be able to send a mailbox message
>>>>> to inform the remote side about the recovery completion.
>>>>>
>>>>
>>>> I think I spot the problem now.
>>>>
>>>> Linux side: file: remoteproc_core.c
>>>> rproc_attach_recovery
>>>> __rproc_detach
>>>> cleans up the resource table and re-loads it
>>>> __rproc_attach
>>>> stops and re-starts subdevices
>>>>
>>>>
>>>> Remote side:
>>>> Remote re-boots after crash
>>>> Detects crash happened previously
>>>> notify crash to Linux
>>>> (Linux is executing above flow meanwhile)
>>>> starts creating virtio devices
>>>> **rproc_virtio_create_vdev - parse vring & create vdev device**
>>>> **rproc_virtio_wait_remote_ready - wait for remote ready** [1]
>>>>
>>>> I think Remote should wait on DRIVER_OK bit, before creating virtio devices.
>>>> The temporary solution I implemented was to make sure vrings addresses are
>>>> not 0xffffffff like following:
>>>>
>>>> while(rsc->rpmsg_vring0.da == FW_RSC_U32_ADDR_ANY ||
>>>> rsc->rpmsg_vring1.da == FW_RSC_U32_ADDR_ANY) {
>>>> usleep(100);
>>>> metal_cache_invalidate(rsc, rproc->rsc_len);
>>>> }
>>>>
>>>> Above works, but I think better solution is to change sequence where remote
>>>> waits before creating virtio devices.
>>>
>>> I am sorry, I should have said, remote should wait before parsing and
>>> assigning vrings to virtio device.
>>>
>>>>
>>>>
>>>> [1] https://github.com/OpenAMP/open-amp/blob/391671ba24840833d882c1a75c5d7307703b1cf1/lib/remoteproc/remoteproc.c#L994
>>>>
>>
>> Actually upon further checking, I think above code is okay. I see that
>> wait_remote_ready is called before vrings are setup on remote fw side.
>>
>> However, during recovery time on remote side, somehow I still have to implement
>> platform specific wait for vrings to setup correctly.
>>
>> From linux side, DRIVER_OK bit is set before vrings are setup correctly.
>> Because of that, when remote firmware sets up wrong vring addresses and then
>> rpmsg channels are not created.
>>
>> I am investigating on this further.
>
> Do you reset the vdev status as requested by the virtio spec?
> https://docs.oasis-open.org/virtio/virtio/v1.3/csd01/virtio-v1.3-csd01.html…
>
> Regards,
> Arnaud
>
Yes, I do. I am actually restoring the default resource table on the firmware
side, which sets the rpmsg_vdev status to 0.

However, when printing the vrings right before wait_remote_ready, I see that
the vrings are not set correctly from the Linux side:

`vring0 = 0xFFFFFFFF, vring1 = 0xFFFFFFFF`

However, the rproc state had still moved to "attached" when checked from the
remoteproc sysfs:

`cat /sys/class/remoteproc/remoteproc0/state`
attached

Somehow the sync between the remote fw and Linux isn't right.
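
To make the mismatch concrete, here is a minimal sketch of a readiness check
on the remote side, assuming a simplified view of the resource table. The
struct `rpmsg_rsc_view`, the helper `host_side_ready()` and the constant
`VRING_ADDR_ANY` are hypothetical names used only for illustration (the
constant stands in for FW_RSC_U32_ADDR_ANY); only the status bit values are
taken from the virtio spec. The check treats the Linux side as ready only when
DRIVER_OK is set and both vring addresses have been assigned:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Device status bits as defined by the virtio specification. */
#define VIRTIO_STATUS_ACKNOWLEDGE 0x01u
#define VIRTIO_STATUS_DRIVER      0x02u
#define VIRTIO_STATUS_DRIVER_OK   0x04u

/* Local stand-in for FW_RSC_U32_ADDR_ANY ("address not assigned yet"). */
#define VRING_ADDR_ANY 0xFFFFFFFFu

/* Hypothetical, trimmed-down view of the rpmsg vdev resource entry. */
struct rpmsg_rsc_view {
	uint8_t  vdev_status; /* written by the Linux virtio driver */
	uint32_t vring0_da;   /* device address of vring 0 */
	uint32_t vring1_da;   /* device address of vring 1 */
};

/*
 * Return true only if the host has set DRIVER_OK *and* both vring
 * addresses differ from the "any address" placeholder.
 */
bool host_side_ready(const struct rpmsg_rsc_view *rsc)
{
	assert(rsc != NULL);

	if (!(rsc->vdev_status & VIRTIO_STATUS_DRIVER_OK))
		return false;

	return rsc->vring0_da != VRING_ADDR_ANY &&
	       rsc->vring1_da != VRING_ADDR_ANY;
}
```

With status 7 (ACKNOWLEDGE | DRIVER | DRIVER_OK) but both addresses still at
0xFFFFFFFF, this check returns false, which matches the state printed above.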
>>
>>>>
>>>> Thanks,
>>>> Tanmay
>>>>> Regards
>>>>> Arnaud
>>>>>
>>>>>
>>>>>>
>>>>>>> Regards,
>>>>>>> Arnaud
>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> Thanks,
>>>>>>>> Tanmay
>>>>>>
>>>>
>>>
>>
On 7/1/25 12:18 PM, Arnaud POULIQUEN wrote:
>
>
> On 7/1/25 17:16, Tanmay Shah wrote:
>>
>>
>> On 7/1/25 3:07 AM, Arnaud POULIQUEN wrote:
>>> Hi Tanmay,
>>>
>>> On 6/27/25 23:29, Tanmay Shah wrote:
>>>> Hello all,
>>>>
>>>> I am implementing remoteproc recovery on attach-detach use case.
>>>> I have implemented the feature in the platform driver, and it works for boot
>>>> recovery.
>>>
>>> Few questions to better understand your use case.
>>>
>>> 1) The linux remoteproc firmware attach to a a remote processor, and you
>>> generate a crash of the remote processor, right?
>>>
>>
>> Yes correct.
>>
>>> 1) How does the remoteprocessor reboot? On a remoteproc request or it is an
>>> autoreboot independent from the Linux core?
>>>
>>
>> It is auto-reboot independent from the linux core.
>>
>>> 2) In case of auto reboot, when does the remoteprocessor send an even to the
>>> Linux remoteproc driver ? beforeor after the reset?
>>>
>>
>> Right now, when Remote reboots, it sends crash event to remoteproc driver after
>> reboot.
>>
>>> 3) Do you expect to get core dump on crash?
>>>
>>
>> No coredump expected as of now, but only recovery. Eventually will implement
>> coredump functionality as well.
>>
>>>>
>>>> However, I am stuck at the testing phase.
>>>>
>>>> When should firmware report the crash ? After reboot ? or during some kind of
>>>> crash handler ?
>>>>
>>>> So far, I am reporting crash after rebooting remote processor, but it doesn't
>>>> seem to work i.e. I don't see rpmsg devices created after recovery.>
>>>> What should be the correct process to test this feature ? How other platforms
>>>> are testing this?
>>>
>>> I have never tested it on ST board. As a first analysis, in case of autoreboot
>>> of the remote processor, it look like you should detach and reattach to recover.
>>
>> That is what's done from the remoteproc framework.
>>
>>> - On detach the rpmsg devices should be unbind
>>> - On attach the remote processor should request RPmsg channels using the NS
>>> announcement mechanism
>>>
>>
>> Main issue is, Remote firmware needs to wait till all above happens. Then only
>> initialize virtio devices. Currently we don't have any way to notify recovery
>> progress from linux to remote fw in the remoteproc framework. So I might have to
>> introduce some platform specific mechanism in remote firmware to wait for
>> recovery to complete successfully.
>
> I guess the rproc->clean_table contains a copy of the resource table that is
> reapplied on attach, and the virtio devices should be re-probed, right?
>
> During the virtio device probe, the vdev status in the resource table is updated
> to 7 when virtio is ready to communicate. Virtio should then call
> rproc_virtio_notify() to inform the remote processor of the status update.
> At this stage, your remoteproc driver should be able to send a mailbox message
> to inform the remote side about the recovery completion.
>
I think I have spotted the problem now.

Linux side: file: remoteproc_core.c
  rproc_attach_recovery
    __rproc_detach
      cleans up the resource table and re-loads it
    __rproc_attach
      stops and re-starts subdevices

Remote side:
  Remote reboots after crash
  Detects that a crash happened previously
  Notifies Linux of the crash
  (Linux is executing the above flow meanwhile)
  Starts creating virtio devices
    **rproc_virtio_create_vdev - parse vring & create vdev device**
    **rproc_virtio_wait_remote_ready - wait for remote ready** [1]

I think the remote should wait on the DRIVER_OK bit before creating virtio
devices. The temporary solution I implemented was to make sure the vring
addresses are not 0xffffffff, as follows:

    /* Spin until Linux has written real addresses for both vrings. */
    while (rsc->rpmsg_vring0.da == FW_RSC_U32_ADDR_ANY ||
           rsc->rpmsg_vring1.da == FW_RSC_U32_ADDR_ANY) {
        /* Back off briefly before polling again. */
        usleep(100);
        /* Drop the cached copy so updates written by Linux become visible. */
        metal_cache_invalidate(rsc, rproc->rsc_len);
    }

The above works, but I think a better solution is to change the sequence so
that the remote waits before creating virtio devices.
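
One weakness of an open-ended wait is that it spins forever if Linux never
completes the attach. A minimal sketch of a bounded variant follows, assuming
a simplified view of the resource table. All names here (`vring_view`,
`wait_for_vrings()`, `refresh_fn`, `fake_host_refresh()`, `VRING_ADDR_ANY`)
are hypothetical and not part of OpenAMP; on real firmware the refresh hook
would be the place for the sleep and the cache invalidate.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Local stand-in for FW_RSC_U32_ADDR_ANY ("address not assigned yet"). */
#define VRING_ADDR_ANY 0xFFFFFFFFu

/* Hypothetical, trimmed-down view of the two vring entries. */
struct vring_view {
	volatile uint32_t da0;
	volatile uint32_t da1;
};

/* Hypothetical platform hook, called between polls (sleep, invalidate). */
typedef void (*refresh_fn)(struct vring_view *rsc, void *arg);

/*
 * Check the vring addresses up to max_polls times, calling refresh after
 * each failed check. Returns 0 once both addresses are assigned, or -1 if
 * they are still unassigned after max_polls checks.
 */
int wait_for_vrings(struct vring_view *rsc, unsigned int max_polls,
		    refresh_fn refresh, void *arg)
{
	assert(rsc != NULL);

	for (unsigned int i = 0; i < max_polls; i++) {
		if (rsc->da0 != VRING_ADDR_ANY && rsc->da1 != VRING_ADDR_ANY)
			return 0;
		if (refresh)
			refresh(rsc, arg);
	}
	return -1;
}

/* Test stand-in for the Linux side: assigns addresses on the Nth refresh. */
struct fake_host {
	unsigned int calls;
	unsigned int assign_on_call;
};

void fake_host_refresh(struct vring_view *rsc, void *arg)
{
	struct fake_host *host = arg;

	if (++host->calls == host->assign_on_call) {
		rsc->da0 = 0x3ED40000u; /* arbitrary example addresses */
		rsc->da1 = 0x3ED44000u;
	}
}
```

On timeout the firmware could report the failure or retry, instead of going
ahead and setting up vrings with placeholder addresses.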
[1]
https://github.com/OpenAMP/open-amp/blob/391671ba24840833d882c1a75c5d730770…
Thanks,
Tanmay
> Regards
> Arnaud
>
>
>>
>>> Regards,
>>> Arnaud
>>>
>>>>
>>>>
>>>> Thanks,
>>>> Tanmay
>>
On 7/1/25 3:07 AM, Arnaud POULIQUEN wrote:
> Hi Tanmay,
>
> On 6/27/25 23:29, Tanmay Shah wrote:
>> Hello all,
>>
>> I am implementing remoteproc recovery on attach-detach use case.
>> I have implemented the feature in the platform driver, and it works for boot
>> recovery.
>
> Few questions to better understand your use case.
>
> 1) The Linux remoteproc firmware attaches to a remote processor, and you
> generate a crash of the remote processor, right?
>
Yes, correct.
> 1) How does the remoteprocessor reboot? On a remoteproc request or it is an
> autoreboot independent from the Linux core?
>
It is an auto-reboot, independent of the Linux core.
> 2) In case of auto reboot, when does the remoteprocessor send an event to the
> Linux remoteproc driver? Before or after the reset?
>
Right now, when the remote reboots, it sends a crash event to the remoteproc
driver after the reboot.
> 3) Do you expect to get core dump on crash?
>
No coredump is expected as of now, only recovery. Eventually I will
implement coredump functionality as well.
>>
>> However, I am stuck at the testing phase.
>>
>> When should firmware report the crash ? After reboot ? or during some kind of
>> crash handler ?
>>
>> So far, I am reporting crash after rebooting remote processor, but it doesn't
>> seem to work, i.e. I don't see rpmsg devices created after recovery.
>> What should be the correct process to test this feature ? How other platforms
>> are testing this?
>
> I have never tested it on an ST board. As a first analysis, in case of autoreboot
> of the remote processor, it looks like you should detach and reattach to recover.
That is what's done from the remoteproc framework.
> - On detach the rpmsg devices should be unbound
> - On attach the remote processor should request RPmsg channels using the NS
> announcement mechanism
>
The main issue is that the remote firmware needs to wait until all of the
above has happened, and only then initialize the virtio devices. Currently we
don't have any way to notify recovery progress from Linux to the remote fw in
the remoteproc framework. So I might have to introduce some platform-specific
mechanism in the remote firmware to wait for recovery to complete successfully.
> Regards,
> Arnaud
>
>>
>>
>> Thanks,
>> Tanmay