I would try doubling the node count by requesting 68 nodes and assigning only 6 tasks per node. This gives each task twice as much memory as in the successful configuration used for the 30-level grid. I would also try setting the namelist variable atm_pio_stride=6. This puts just 1 PIO task on each node, which should minimize the overhead incurred when writing the restart file.
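
To make the arithmetic behind this suggestion concrete, here is a rough sketch in Python. It is only an illustration and not part of the model setup. The earlier layout of 34 nodes with 12 tasks per node is inferred from "double the nodes" and "twice as much memory", so treat it as an assumption. The sketch also assumes that the I/O tasks start at rank 0 and that MPI ranks fill the nodes in order, which may not match your machine's task placement.

```python
# Layout arithmetic for the suggested configuration (illustrative only).
NODES = 68
TASKS_PER_NODE = 6
PIO_STRIDE = 6  # the value suggested for atm_pio_stride

# Assumption: the earlier layout is inferred from the text, not stated in it.
OLD_NODES = NODES // 2                   # "double the nodes" implies 34
OLD_TASKS_PER_NODE = TASKS_PER_NODE * 2  # "twice as much memory" implies 12

total_tasks = NODES * TASKS_PER_NODE
old_total_tasks = OLD_NODES * OLD_TASKS_PER_NODE

# Memory per task relative to the earlier layout. The node memory cancels
# out, so no machine-specific memory size is needed.
memory_ratio = OLD_TASKS_PER_NODE / TASKS_PER_NODE

# Assumption: I/O tasks are every PIO_STRIDE-th rank starting at rank 0,
# and ranks are placed on nodes contiguously (ranks 0-5 on node 0, etc.).
io_ranks = list(range(0, total_tasks, PIO_STRIDE))
io_tasks_per_node = {}
for rank in io_ranks:
    node = rank // TASKS_PER_NODE
    io_tasks_per_node[node] = io_tasks_per_node.get(node, 0) + 1

print("total tasks (new, old):", total_tasks, old_total_tasks)
print("memory per task relative to old layout:", memory_ratio)
print("number of I/O tasks:", len(io_ranks))
print("max I/O tasks on any node:", max(io_tasks_per_node.values()))
```

Under these assumptions the total task count stays the same while each task gets twice the memory, and a stride equal to the tasks per node lands exactly one I/O task on every node.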