/// (sometimes called features) and \f$(d^f_1,\dots,d^f_n)\f$ are the filter dimensions. It is required that for all \f$i\f$, \f$0 < l_i(d^f_i - 1) + 1 \le d_i\f$.
/// (See below for the definition of the dilation \f$l_i\f$);
///
/// and four optional parameters:
///
/// 3. <i>(the window movement strides)</i> a vector of positive integers \f$(s_1,\dots,s_n)\f$ (default is all ones),
/// 4. <i>(the window dilation strides)</i> a vector of positive integers \f$(l_1,\dots,l_n)\f$ (default is all ones),
/// 5. <i>(the padding below)</i> a vector of non-negative integers \f$(p_1,\dots,p_n)\f$ (default is all zeros), and
/// 6. <i>(the padding above)</i> a vector of non-negative integers \f$(q_1,\dots,q_n)\f$ (default is all zeros).
///
/// The output has the shape \f$(N,C_\textit{out},d'_1,\dots,d'_n)\f$, where \f$d'_i = \lceil \frac{d_i + p_i + q_i - l_i(d^f_i - 1)}{s_i} \rceil\f$.
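///
/// As an illustration of the formula above (the values are chosen here purely as an example), take one spatial axis with \f$d_i = 5\f$, \f$d^f_i = 3\f$, \f$l_i = 2\f$, \f$s_i = 2\f$, and \f$p_i = q_i = 1\f$:
///
/// \f[
///      d'_i = \left\lceil \frac{5 + 1 + 1 - 2(3 - 1)}{2} \right\rceil = \left\lceil \frac{3}{2} \right\rceil = 2
/// \f]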
///
/// Given an input image batch tensor \f$T_\textit{in}\f$, first define the <i>padded input tensor</i> \f$T_\textit{pad}\f$, with shape \f$(N,C_\textit{in},d_1+p_1+q_1,\dots,d_n+p_n+q_n)\f$, as follows:
///
/// \f[
///      T_\textit{pad}[a,c,i_1,\dots,i_n] = T_\textit{in}[a,c,i_1 - p_1,\dots,i_n - p_n] \text{ if for all }k, p_k \le i_k \lt p_k + d_k, \text{ else } 0
/// \f]
///
/// then, given an input filter tensor \f$T_\textit{filt}\f$, the output tensor \f$T_\textit{out}\f$ is defined by the following equation: