Google Drive API – uploading a blob via MediaFileUpload

I’m trying to upload a file (a PDF) to Google Drive via the v3 API; I want to create the file from an email attachment. All the docs I’ve found seem to rely on a local file, whereas I want to upload a blob.

Here is the essence of my code so far:

class MailHandler(InboundMailHandler):
  def receive(self, mail_message):
    logging.info("Received a message from: " + mail_message.sender)
    if hasattr(mail_message, 'attachments'):
      logging.info("Has attachments")
      for filename, filecontents in mail_message.attachments:
        # the decoded attachment content I actually want to upload
        file_blob = filecontents.payload.decode(filecontents.encoding)
        c = UserModel.query().get()
        media = MediaFileUpload(filename, mimetype='image/jpg')
        file = drive.files().create(body={'name': 'testupload.jpg'}, media_body=media)
        file.execute(c.credentials.authorize(http))

This throws an error due to the file not existing, which it doesn’t, because what I have is a blob rather than a file on disk.

Can somebody help me?
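
From looking through googleapiclient.http, I suspect what I actually need is MediaIoBaseUpload, wrapping the attachment bytes in an in-memory stream instead of pointing at a file path. An untested sketch of what I mean (keeping the same authorized http object as above):

import io
from googleapiclient.http import MediaIoBaseUpload

# file_blob is the decoded attachment content from the loop above
stream = io.BytesIO(file_blob)
media = MediaIoBaseUpload(stream, mimetype='application/pdf', resumable=True)
request = drive.files().create(body={'name': filename}, media_body=media)
request.execute(http=c.credentials.authorize(http))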

How to convert an OpenCV 3 cv::Mat to a NumPy array using ctypes (C++ to Python)?

I am trying to convert an OpenCV 3 cv::Mat image in C++ to a NumPy array in Python using ctypes. The C++ side is a shared library that reads the image from a shared memory region. The shared memory itself is working and is not relevant to this question.

extern "C" {
    unsigned char* read_data() {
        void *shd_mem_offset = (char *) region->get_address() + sizeof(sFrameHeader);
        unsigned char *frame_data = (unsigned char *) shd_mem_offset;
        return frame_data;
    }

    sFrameHeader *read_header() {
        sFrameHeader *frame_header;
        void *shd_mem_offset = region->get_address();
        frame_header = (sFrameHeader*)(shd_mem_offset);
        return frame_header;
    }

}

There are two functions exposed to ctypes. One returns the cv::Mat attributes such as cols, rows and step; the other function returns the cv::Mat data (mat.data).

The python side looks like this:

import ctypes
import numpy as np
from numpy.ctypeslib import ndpointer


class sFrameHeader(ctypes.Structure):
    _fields_ = [
        ("frame_size", ctypes.c_size_t),
        ("frame_mat_rows", ctypes.c_int),
        ("frame_mat_cols", ctypes.c_int),
        ("frame_mat_type", ctypes.c_int),
        ("frame_mat_step", ctypes.c_size_t)]

Lib = ctypes.cdll.LoadLibrary('interface.so')
Lib.read_header.restype = ctypes.POINTER(sFrameHeader)
header = Lib.read_header()
frame_size = header.contents.frame_size
frame_mat_rows = header.contents.frame_mat_rows
frame_mat_cols = header.contents.frame_mat_cols
frame_mat_step = header.contents.frame_mat_step
frame_mat_type = header.contents.frame_mat_type

Lib.read_data.restype = ndpointer(dtype=ctypes.c_char, shape=(frame_size,))
data = Lib.read_data()

I then want to display the image in Python…

cv2.imshow('image', data)
cv2.waitKey(0)
cv2.destroyAllWindows()

…but I think the shape of the NumPy array is wrong. How can I correctly build the NumPy array so I can use it as an OpenCV image in Python?

Are there any solutions for converting it to NumPy?
There are boost::python converters (like the one linked here), but I only use Boost for the shared memory and would like to stay with ctypes for the C binding.
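
What I think I need (untested sketch, assuming an 8-bit frame such as CV_8UC3) is to declare the restype as a uint8 ndpointer and reshape the flat buffer using the dimensions from the header:

import cv2
import numpy as np
from numpy.ctypeslib import ndpointer

# Lib, frame_size and the frame_mat_* values are the ones read above
Lib.read_data.restype = ndpointer(dtype=np.uint8, shape=(frame_size,))
raw = Lib.read_data()

# reshape returns a view, so the shared memory must stay alive while it is used;
# if frame_mat_step != frame_mat_cols * channels, row padding would also need
# handling (e.g. via numpy strides)
channels = frame_size // (frame_mat_rows * frame_mat_cols)
image = raw.reshape((frame_mat_rows, frame_mat_cols, channels))

cv2.imshow('image', image)
cv2.waitKey(0)
cv2.destroyAllWindows()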

Using sklearn linear regression, how can I constrain the calculated regression coefficients to be greater than 0?

I’m using the reference for sklearn here http://scikit-learn.org/stable/modules/generated/sklearn.linear_model.LinearRegression.html but there is no option to constrain the regression coefficients.

Does anyone know of another package in python to perform multiple variable linear regression and constrain the regression coefficients to be greater than 0?

Here is the code I have so far.

'''data:
date        A            B              C
10/30/2015  0.063363323 -0.005218807    0.079777558
11/30/2015  -0.013171244    -0.008727599    0.010352028
12/31/2015  -0.017551268    8.09E-05    -0.020491923
1/29/2016   -0.042606469    0.052272139 -0.080362246
2/29/2016   -0.015224562    0.031250961 0.029988488
3/31/2016   0.058291876 -0.000238614    0.056727336
4/29/2016   0.000505675 -0.005325338    0.02854057
5/31/2016   0.012766515 0.008548162 -0.001631845
6/30/2016   -0.038981203    0.064236963 0.00570145
7/29/2016   0.033715429 0.024269606 0.02703294
8/31/2016   -0.002083837    -0.009439625    0.004129397
9/30/2016   -0.009825674    -0.01737909 -0.019251885
11/30/2016  0.0084733   -0.11668582 0.031928726
12/30/2016  0.017084282 -0.005553088    0.029372131
1/31/2017   0.014263947 0.004036504 0.00187079
2/28/2017   0.037375566 0.016081105 0.039331615
3/31/2017   -0.002494984    -0.005942793    -0.002097504
4/28/2017   -0.005054922    0.015685226 0.008243977
5/31/2017   0.002285393 0.020771375 0.002697755
6/30/2017   0.002841457 0.004886117 0.019202011
7/31/2017   0.014866638 -0.006900926    0.010126577
8/31/2017   -0.016647997    0.035687133 -0.008709075
9/29/2017   0.019523651 -0.022154361    0.020468398
10/31/2017  0.019407629 -0.000705663    0.016574416
11/30/2017  0.027486425 0.008008173 0.033427299
12/29/2017  0.007861222 0.018095096 0.017908809
1/31/2018   0.058702838 -0.032765285    0.05
'''

from sklearn import linear_model

# df is a pandas DataFrame built from the data above (columns A, B, C)
reg = linear_model.LinearRegression(fit_intercept=False)
reg.fit(df[['B', 'C']], df['A'])

print(reg.coef_)

# [ 0.67761268 -0.08845756]

Working code below

import numpy as np
from scipy.optimize import lsq_linear

lb = 0
ub = np.Inf
res = lsq_linear(df[['B', 'C']],
                 df['A'],
                 bounds=(lb, ub))

print(res.x)
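
As a side note, scipy also has scipy.optimize.nnls for exactly this non-negative least-squares problem, and, if I understand correctly, newer scikit-learn releases added a positive=True option to LinearRegression. A quick sketch of the nnls variant:

from scipy.optimize import nnls

# Non-negative least squares: minimise ||Ax - b|| subject to x >= 0
coefs, residual = nnls(df[['B', 'C']].values, df['A'].values)
print(coefs)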

Why does gdalinfo --version show different output when run in a terminal compared to when run through Python?

Why does gdalinfo --version show different output when run in a terminal compared to when run through Python?

Terminal:

$ gdalinfo --version

GDAL 2.2.3, released 2017/11/20

Python:

import os
os.system('gdalinfo --version')

gives 6
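
For what it’s worth, os.system only returns the command’s exit status, so to see the output the Python process actually gets (and which gdalinfo binary it resolves on its PATH) I would try a subprocess call along these lines:

import subprocess

# Capture stdout instead of the exit status that os.system() returns
print(subprocess.check_output(['gdalinfo', '--version']).decode())

# Check which gdalinfo binary Python's environment resolves to
print(subprocess.check_output(['which', 'gdalinfo']).decode())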

Bokeh twin axes not scaling well when dynamically adding curves

I would like to make an app to explore some technical recordings. My records have multiple columns, with different names, and may have a very large number of rows. In my exploration I only need to draw a few columns, so I’m using a Bokeh server and dynamically adding and removing the columns to visualize.

Because my data are sometimes voltage (mV) and sometimes temperature (°C), I can’t use a single scale, which is why I’m using twin-scale plots. Thanks to some people on Stack Overflow, I can get good scaling if the curves are added to the plot before making the document.
But I need to add the curves dynamically, and that is where I run into problems.

When playing with adding and removing curves using the code below, I see strange behaviour: sometimes everything is fine and the scaling is good, sometimes the scaling does not work and a curve becomes unreadable, and sometimes the plot is fine but the values on the scales are not right.
Sometimes it seems that the scaling itself works but the display of the scale fails.

Am I doing something wrong? Am I asking too much of Bokeh? Is my application (adding/removing curves dynamically) outside the Bokeh scope?

from numpy import pi, arange, sin, linspace
from bokeh.plotting import curdoc
from bokeh.models import LinearAxis, DataRange1d, Plot, Title, PanTool, WheelZoomTool
from bokeh.models import Circle, ColumnDataSource
from bokeh.models.widgets import Button
from bokeh.layouts import row

# get some data
x = arange(-2*pi, 2*pi, 0.1)
y = sin(x)
y2 = linspace(0, 100, len(y))
cds = ColumnDataSource(data=dict(x=x, y=y, y2=y2))
c1 = Circle(x = "x", y = "y", line_color = "red", fill_color = "red")
c2 = Circle(x = "x", y = "y2", line_color = "blue", fill_color = "blue")

# configure plot
p = Plot(title = Title(text="Title"), x_range = DataRange1d(), y_range = DataRange1d())
p.add_tools(PanTool(), WheelZoomTool())
p.add_layout(LinearAxis(), "below")
p.add_layout(LinearAxis(), "left")

# add extra y range
p.extra_y_ranges = {"y2": DataRange1d()}
p.add_layout(LinearAxis(y_range_name="y2"), 'right')

# set renderers to y_ranges
#p.y_range.renderers = []
#p.extra_y_ranges["y2"].renderers = []

c1_renderer = None
c1_drawed = False
c2_renderer = None
c2_drawed = False

def show_renderers():
    print("y principal : " + str(p.y_range.renderers))
    print("y secondair : " + str(p.extra_y_ranges["y2"].renderers))

# add buttons and callback
def draw_on_y1():
    global c1_renderer, c1_drawed
    if c1_drawed:
        # remove curve
        p.renderers.remove(c1_renderer)
        p.y_range.renderers.remove(c1_renderer)
        c1_drawed = False
    else:
        # add curve
        c1_renderer = p.add_glyph(cds, c1)
        p.y_range.renderers = [c1_renderer]
        c1_drawed = True
    show_renderers()

def draw_on_y2():
    global c2_renderer, c2_drawed
    if c2_drawed:
        # remove curve
        p.renderers.remove(c2_renderer)
        p.extra_y_ranges["y2"].renderers.remove(c2_renderer)
        c2_drawed = False
    else:
        # add curve
        c2_renderer = p.add_glyph(cds, c2, y_range_name = "y2")
        p.extra_y_ranges["y2"].renderers = [c2_renderer]
        c2_drawed = True
    show_renderers()

button_y1 = Button(label="draw on y1")
button_y1.on_click(draw_on_y1)
button_y2 = Button(label="draw on y2")
button_y2.on_click(draw_on_y2)

curdoc().add_root(row(p, button_y1, button_y2))
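
One thing I have been experimenting with (I have no idea whether it is the intended approach) is rebuilding both renderer lists from scratch after every add/remove instead of mutating them in place, so that each DataRange1d only ever follows its own curve:

def sync_ranges():
    # Rebuild the renderer lists so each range only follows its own curve
    p.y_range.renderers = [c1_renderer] if c1_drawed else []
    p.extra_y_ranges["y2"].renderers = [c2_renderer] if c2_drawed else []

I would then call sync_ranges() at the end of both draw_on_y1() and draw_on_y2(), but I am not sure this addresses the underlying scaling issue.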

How to nest multiple Python functions to create a sort of pipeline?

Hello, I have 3 functions f1(), f2() and f3(). The output of each one is the input of the next, meaning output = f3(f2(f1(data))).

Instead of writing

def outp(data):
    o1 = f1(data)
    o2 = f2(o1)
    o3 = f3(o2)
    return o3

output = outp(data)

Is there a way to do this by simply providing a list of functions to some other general function and letting it handle the chaining?
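
What I was imagining is a small generic helper, maybe based on functools.reduce, that takes the data and a list of functions:

from functools import reduce

def pipeline(data, funcs):
    # Feed the running result through each function in order
    return reduce(lambda acc, f: f(acc), funcs, data)

output = pipeline(data, [f1, f2, f3])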

Kivy Clock stops after a while on its own

I have a simple OSD app made with Python and Kivy, running on a Raspberry Pi. All it does is start a Clock that periodically pulls data from a DB and updates Label.text…

import kivy
kivy.require('1.10.1')

from kivy.app import App
from kivy.uix.boxlayout import BoxLayout
from kivy.uix.label import Label
from kivy.clock import Clock
import MySQLdb

class OSDBoxLayout(BoxLayout):

    def __init__(self, **kwargs):
        super(OSDBoxLayout, self).__init__(**kwargs)
        Clock.schedule_interval(self.update_stats, 20)
        Clock.schedule_once(self.update_stats, 0.1)

    def update_stats(self, *args):
        try:
            self.active = []
            self.db = MySQLdb.connect(<DB connect info here>)

            self.cursor = self.db.cursor()
            self.cursor.execute('SET NAMES utf8;')
            self.cursor.execute('SET CHARACTER SET utf8;')
            self.cursor.execute('SET character_set_connection=utf8;')
            self.cursor.execute('SELECT domain FROM domain_info')

            self.total_domains = self.cursor.rowcount

            self.cursor.execute('SELECT domain FROM domain_list WHERE checked="" OR checked="Update"')

            self.for_update_domains = self.cursor.rowcount

            self.db.close()

            self.ids.marked_update.text=str(self.for_update_domains)

        except:
            pass


class osdApp(App):
    def build(self):
        self.title = 'OSD'
        return OSDBoxLayout()


if __name__ == '__main__':
    osdApp().run()

Everything works fine for a while… but after some time it stops. The app does not crash, it still displays on the LCD, but the update is no longer executed. I need to kill it and start it again. There is no error in the Kivy log.

I suspect that the clock just hangs, or has some default expiration?
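
One thing I plan to try, in case an exception is being raised and then silently swallowed by the bare except (which would also explain the empty log), is to log the failure instead, and perhaps pass a connect_timeout to MySQLdb.connect so a stuck connection cannot block the callback forever. A rough sketch, where refresh_from_db is a hypothetical helper wrapping the MySQL calls above:

# (with `import logging` at the top of the module)
def update_stats(self, *args):
    try:
        self.refresh_from_db()  # hypothetical helper doing the MySQL work above
    except Exception:
        # Log the full traceback instead of swallowing it, so a failing or
        # hanging DB call actually shows up in the log
        logging.exception("update_stats failed")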

How can I extract the embeddings from an InfoGAN discriminator for a similarity search?

I’m trying to adapt a Keras InfoGAN implementation to essentially extract the transfer values (or embeddings) for an image I feed to the discriminator. With these, I want to perform a similarity search over the resulting vectors to find the n most similar images in the dataset to the one provided.

I want to use Keras, so I’m looking at this implementation as a reference:

I found this TensorFlow 0.11 implementation where they provide functionality to achieve the similarity goal, but I’m having trouble trying to accomplish something similar in Keras.

Put more simply, I want to understand which discriminator layer would be best to take the transfer values from, and how I can do that in Keras with a trained model. The discriminator layers:

    x = Convolution2D(64, (4, 4), strides=(2,2))(self.d_input)
    x = LeakyReLU(0.1)(x)
    x = Convolution2D(128, (4, 4), strides=(2,2))(x)
    x = LeakyReLU(0.1)(x)
    x = BatchNormalization()(x)
    x = Flatten()(x)
    x = Dense(1024)(x)
    x = LeakyReLU(0.1)(x)
    self.d_hidden = BatchNormalization()(x) # Store this to set up Q
    self.d_output = Dense(1, activation='sigmoid', name='d_output')(self.d_hidden)

    self.discriminator = Model(inputs=[self.d_input], outputs=[self.d_output], name='dis_model')
    self.opt_discriminator = Adam(lr=2e-4)
    self.discriminator.compile(loss='binary_crossentropy',
                               optimizer=self.opt_discriminator)
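
What I have been considering (not sure whether it is the right approach) is to build a second Model that reuses the trained discriminator layers but stops at self.d_hidden, and to use its output as the embedding for each image; here images stands for a batch of inputs shaped as self.d_input expects:

# Shares the already-trained weights up to the 1024-d hidden representation
embedding_model = Model(inputs=[self.d_input], outputs=[self.d_hidden],
                        name='embedding_model')

embeddings = embedding_model.predict(images)  # shape: (batch_size, 1024)

The n most similar images would then just be the nearest neighbours of a query vector under cosine (or Euclidean) distance.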

Is line continuation with backslash dangerous in Python?

I understand that current best practice for line continuation is to use implied continuation inside parenthesis. For example:

a = (1 + 2
     + 3 + 4)

From PEP8 (https://www.python.org/dev/peps/pep-0008/):

The preferred way of wrapping long lines is by using Python’s implied line continuation inside parentheses, brackets and braces. Long lines can be broken over multiple lines by wrapping expressions in parentheses. These should be used in preference to using a backslash for line continuation.

I intend to follow this convention going forward; however, my question concerns how worried I should be about bugs in existing code that continues lines with a backslash:

a = 1 + 2 \
    + 3 + 4

The Python docs (https://docs.python.org/2.7/howto/doanddont.html#using-backslash-to-continue-statements) warn that a stray space at the end of a line after the backslash can make the code “subtly wrong”; however, as pointed out in this question (How can using Python backslash line continuation be subtly wrong?), the example given simply results in a SyntaxError being raised, which is not a subtle problem since it is easily identified. My question, then, is: do there exist cases where continuing a line with a backslash causes something worse than a parsing error? I am interested in examples where a mistake in a backslash continuation yields a runtime exception or, worse, code that runs silently but with unintended behavior.
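
The closest I have come up with myself is a case where the continuation line happens to start at column 0, so that losing the backslash leaves two statements that are both still valid:

# Intended: total == 15
total = 10 \
+ 5

# Backslash accidentally deleted: both lines still parse. The second line is
# just a discarded unary-plus expression, so total is silently left at 10.
total = 10
+ 5

But that layout is unusual; I am wondering whether there are more realistic cases.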

Which is the better method of Histogram Equalisation? OpenCV/Python

I have written a program in Python that takes a dataset of frontal-view face images taken in lab settings (blank background, etc.) and classifies them using sklearn’s SVC. To create the feature vectors that I pass into SVC as the training and validation sets, the first step is to apply histogram equalisation to the images. In my testing, I have used both:

clahe = cv2.createCLAHE(clipLimit=2.0, tileGridSize=(8, 8))
clahe_image = clahe.apply(gray)

and

image = cv2.equalizeHist(gray)

where gray is a grayscale image of a face. I have read about both of these methods in the documentation and online, but in practice I have not seen much (if any) difference in performance.

Is there any reason I should pick one of these methods over the other?
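
For context, this is roughly how I am comparing the two on a single image ('face.png' is just a placeholder path):

import cv2

gray = cv2.imread('face.png', cv2.IMREAD_GRAYSCALE)

# Global equalisation: a single histogram over the whole image
global_eq = cv2.equalizeHist(gray)

# CLAHE: adaptive equalisation over an 8x8 grid of tiles, with contrast limiting
clahe = cv2.createCLAHE(clipLimit=2.0, tileGridSize=(8, 8))
local_eq = clahe.apply(gray)

cv2.imshow('global equalizeHist', global_eq)
cv2.imshow('CLAHE', local_eq)
cv2.waitKey(0)
cv2.destroyAllWindows()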