Backward stochastic differential equations (BSDE) are known to be a powerful tool in mathematical modeling due to their inherent connection with second-order parabolic partial differential equations (PDE) established by the non-linear Feynman-Kac relations. The fundamental power of BSDEs lies in the fact that with them one does not merely obtain the solution of the corresponding PDE but also its spatial gradient through the control process Z. Classical numerical methods tackling the system face the so-called curse of dimensionality and cannot be used to solve high-dimensional problems. In recent years, multiple approaches have been developed to overcome this computational burden, building on deep learning and showing remarkable empirical success even beyond 10 dimensions. However, such Deep BSDE methods struggle with giving accurate approximations for the Z-process throughout the whole time horizon. In this thesis we propose a novel approach aimed to give better estimations for the control problem, exploiting the natural dynamics of the Z-process given by Malliavin calculus. The proposed methods use deep learning parametrizations taking advantage of the universal approximation capability of neural networks. The Malliavin derivatives are estimated through the Malliavin chain rule. Two discrete numerical methods are developed which are called One-Step Malliavin (OSM) and Multi-Step Malliavin (MSM) schemes respectively. An error analysis is carried out proving the consistency of the algorithms and showing first-order convergence under certain assumptions. Numerical experiments are presented to demonstrate the efficiency of the Malliavin formulation compared to other Deep BSDE solvers.