Background: I am running Python with Theano on a GPU, and I care about speed.
Scenario: I have a largeish matrix (C) which is stored as a shared variable, and I need to update a subset of the rows (modified_rows) by some other matrix (C_delta). What should I do?
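
To pin down the operation being asked about, here is a minimal NumPy sketch of the intended semantics. It is not Theano code and says nothing about GPU performance. It assumes that "update" means adding C_delta to the selected rows, that modified_rows holds unique row indices, and that C_delta has one row per entry of modified_rows; the shapes and values are made up for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical sizes: a 1000 x 50 matrix, of which three rows change.
C = rng.standard_normal((1000, 50)).astype(np.float32)
modified_rows = np.array([3, 17, 256])  # assumed to contain no duplicates
C_delta = rng.standard_normal((len(modified_rows), C.shape[1])).astype(np.float32)

C_before = C.copy()  # kept only so the result can be checked

# Assumed meaning of "update": add C_delta to the selected rows in place.
# With repeated indices this form would apply only one increment per row;
# np.add.at(C, modified_rows, C_delta) would accumulate all of them instead.
C[modified_rows] += C_delta
```

In Theano the symbolic counterpart would presumably be built with theano.tensor.inc_subtensor (or theano.tensor.set_subtensor if the rows should be overwritten rather than incremented) and applied through the updates argument of theano.function; whether that is the fastest option on a GPU is exactly the open question here.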