Decomposition of matrices for CPLEX and machine le

I am working with large matrices, and from time to time my code terminates with a 'killed:9' message in my terminal. I'm working on Mac OS X.

A wise programmer tells me the problem in my code is linked to the stored matrix I am dealing with.

nn = 35000
dd = 35
XX = np.random.rand(nn,dd)
XX = XX.dot(XX.T)    #it should be faster than np.dot(XX,XX.T)
yy = np.random.rand(nn,1)
XX = np.multiply(XX,yy.T)

I have to store this huge matrix XX. My guess is to split the matrix with

upp = np.triu(XX)

Do I actually save space in terms of stored data? What if later on I store

low = upp.T

am I wasting memory and computational time?

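It should take up the same total amount of memory: np.triu returns a dense array of the same shape as XX, with the entries below the diagonal set to zero. To avoid the error, you are probably looking at a few options: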

1. Process batch-wise: If you create your model through the CPLEX API, I believe the data is handled by CPLEX once you have supplied it. So you could split the data, load it piece by piece, and add it to the model consecutively.
2. Allocate memory manually: If you use Cython, you can use the function malloc to allocate the memory for your array manually; the size will very likely be no issue then.
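
A minimal sketch of the batch-wise idea in plain NumPy, assuming the rows of the final matrix can be consumed one block at a time. The helper name `scaled_gram_blocks` and the block size are hypothetical, and how each block is passed on to CPLEX is not shown because it depends on your model:

```python
import numpy as np

def scaled_gram_blocks(X, y, block_rows=1000):
    """Yield (start, block), where block holds rows start.. of (X @ X.T) * y.T."""
    n = X.shape[0]
    y_row = y.reshape(1, n)                  # broadcasts over columns, like np.multiply(XX, yy.T)
    for start in range(0, n, block_rows):
        stop = min(start + block_rows, n)
        block = X[start:stop].dot(X.T)       # shape (stop - start, n)
        block *= y_row                       # scale column j by y[j]
        yield start, block

# Small demonstration; these sizes are placeholders, not the question's 35000.
rng = np.random.default_rng(0)
X = rng.random((500, 35))
y = rng.random((500, 1))

full = np.multiply(X.dot(X.T), y.T)          # reference result, only feasible for small n
for start, block in scaled_gram_blocks(X, y, block_rows=128):
    # Each block would be handed to the model here instead of being compared.
    assert np.allclose(block, full[start:start + block.shape[0]])
```

Only one block of `block_rows * n` doubles is alive at a time, so peak memory is controlled by the block size rather than by `n * n`.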

Option 1 would be the preferred option in my opinion.
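
On the memory question itself, a quick check is to compare `nbytes` on a small example. The sketch below uses a reduced size `n = 1000` (an assumption, so that it runs quickly) and also shows a packed alternative via `np.triu_indices`, which is not part of the original code. Note that dropping the lower triangle only loses no information for a symmetric matrix, which holds for `XX.dot(XX.T)` but not for the product after scaling with `yy`:

```python
import numpy as np

n, d = 1000, 35            # reduced size for illustration; the question uses n = 35000
X = np.random.rand(n, d)
G = X.dot(X.T)             # symmetric Gram matrix, shape (n, n)

upp = np.triu(G)           # still a dense (n, n) array; the lower part is just zeros
same_size = upp.nbytes == G.nbytes

# Packed alternative: keep only the n*(n+1)/2 upper-triangular entries in a 1-D array.
rows, cols = np.triu_indices(n)
packed = G[rows, cols]
packed_fraction = packed.nbytes / G.nbytes   # roughly one half

# Full-size estimate for the question's matrix (float64 = 8 bytes per element).
full_gib = 35000 * 35000 * 8 / 2**30         # a bit over 9 GiB
```

Keep in mind that the index arrays returned by `np.triu_indices` are themselves large at full size, so this only illustrates where the factor of two comes from.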

EDIT: I constructed a little example that actually combines the two options. The array is not stored as a Python object but as a C array, and the values are computed piecewise. I allocate the memory for the array using Cython and malloc. To run the code, you have to install Cython. Then you can open a Python interpreter in the directory where you saved the file and write:

import pyximport;pyximport.install()
import nameofscript

An example for processing your array:

import numpy as np
from libc.stdlib cimport malloc # Allocate memory manually
from cython.parallel import prange # Parallel processing without GIL
dd = 35
# With cdef we can define C variables in Cython.
cdef double **XXN
cdef double y[35000]
cdef int i, j, nn
nn = 35000
# Allocate memory for the Matrix with 1.225 billion double elements
XXN = <double **>malloc(nn * sizeof(double *))
for i in range(nn):
    XXN[i] = <double *>malloc(nn * sizeof(double))

XX = np.random.rand(nn,dd)
for i in range(nn):
    for j in range(nn):
        # Compute the values for the new matrix element by element
        XXN[i][j] = XX[i].dot(XX[j].T)

# Multiply the new matrix with y column-wise, i.e. scale column j by y[j]
# (this matches np.multiply(XX, yy.T) in the question)
for i in prange(nn, nogil=True, num_threads=4):
    for j in range(nn):
        XXN[i][j] = XXN[i][j] * y[j]

Save this file as nameofscript.pyx and run it as described above. I have briefly tested this script, and it runs for about half an hour on my machine. You can extend this script and use the result array XXN for your further computations.

A note on the parallelization example: I declared y but did not assign any values to it. Since y is declared as a C array, you can, e.g., fill it with values from Python objects. You can then carry out the last multiplication without the GIL, in parallel, as shown in the code sample.

Regarding computational efficiency: This is probably not the fastest way (that may be writing your code entirely against the CPLEX C interface), but it does not throw the memory error and runs in an acceptable time if you do not have to repeat this computation too often.