Getting the closest date to a given date

Given this base date:

base_date = "10/29 06:58 AM" 

I want to find the tuple in the list that contains the date closest to base_date, but it must not be an earlier date.

    list_date = [('10/30 02:18 PM', '-103', '-107'),
                 ('10/30 02:17 PM', '+100', '-110'),
                 ('10/29 02:15 AM', '-101', '-109')]

So here the output should be ('10/30 02:17 PM', '+100', '-110') (it can't be the 3rd tuple, because its date is earlier than the base date).

My question: is there a module for this kind of date comparison? I tried first converting the data to AM format and then comparing, but my code is getting ugly with a lot of slicing.

Edit:

A big list for testing:

 [('10/30 02:18 PM', '+13 -103', '-13 -107'), ('10/30 02:17 PM', '+13 +100', '-13 -110'), ('10/30 02:15 PM', '+13 -101', '-13 -109'), ('10/30 02:14 PM', '+13 -103', '-13 -107'), ('10/30 01:59 PM', '+13 -105', '-13 -105'), ('10/30 01:46 PM', '+13 -106', '-13 -104'), ('10/30 01:37 PM', '+13 -105', '-13 -105'), ('10/30 01:24 PM', '+13 -107', '-13 -103'), ('10/30 01:23 PM', '+13 -106', '-13 -104'), ('10/30 01:05 PM', '+13 -103', '-13 -107'), ('10/30 01:02 PM', '+13 -104', '-13 -106'), ('10/30 12:55 PM', '+13 -103', '-13 -107'), ('10/30 12:51 PM', '+13.5 -110', '-13.5 +100'), ('10/30 12:44 PM', '+13.5 -108', '-13.5 -102'), ('10/30 12:38 PM', '+13.5 -107', '-13.5 -103'), ('10/30 12:35 PM', '+13 -102', '-13 -108'), ('10/30 12:34 PM', '+13 -103', '-13 -107'), ('10/30 12:06 PM', '+13.5 -110', '-13.5 +100'), ('10/30 11:57 AM', '+13.5 -108', '-13.5 -102'), ('10/30 11:36 AM', '+13.5 -107', '-13.5 -103'), ('10/30 09:01 AM', '+13.5 -110', '-13.5 +100'), ('10/30 08:59 AM', '+13.5 -108', '-13.5 -102'), ('10/30 08:13 AM', '+13.5 -105', '-13.5 -105'), ('10/30 06:11 AM', '+13.5 +100', '-13.5 -110'), ('10/30 06:09 AM', '+13.5 -105', '-13.5 -105'), ('10/30 06:04 AM', '+13.5 -110', '-13.5 +100'), ('10/30 05:32 AM', '+13.5 -105', '-13.5 -105'), ('10/30 04:48 AM', '+13.5 -107', '-13.5 -103'), ('10/30 12:51 AM', '+13.5 -110', '-13.5 +100'), ('10/29 01:31 PM', '+13.5 -105', '-13.5 -105'), ('10/29 01:31 PM', '+13 +103', '-13 -113'), ('10/29 01:28 PM', '+13 -102', '-13 -108'), ('10/29 07:59 AM', '+13 -105', '-13 -105'), ('10/29 07:20 AM', '+13 -103', '-13 -107'), ('10/29 07:14 AM', '+13 -105', '-13 -105'), ('10/29 04:47 AM', '+13 +100', '-13 -110'), ('10/29 04:14 AM', '+13 -105', '-13 -105'), ('10/28 08:17 PM', '+12.5 +100', '-12.5 -110'), ('10/28 12:52 PM', '+12.5 -105', '-12.5 -105')] 

A big list for test2:

    [('10/30 04:30 PM', '+1.5 -111', '-1.5 +101'), ('10/30 04:24 PM', '+1.5 -110', '-1.5 +100'), ('10/30 04:21 PM', '+1.5 -111', '-1.5 +101'), ('10/30 04:15 PM', '+1.5 -112', '-1.5 +102'), ('10/30 04:14 PM', '+1.5 -110', '-1.5 +100'), ('10/30 03:57 PM', '+1.5 -111', '-1.5 +101'), ('10/30 03:40 PM', '+1.5 -110', '-1.5 +100'), ('10/30 03:31 PM', '+1.5 -111', '-1.5 +101'), ('10/30 03:30 PM', '+1.5 -109', '-1.5 -101'), ('10/30 03:25 PM', '+1.5 -107', '-1.5 -103'), ('10/30 03:24 PM', '+1.5 -110', '-1.5 +100'), ('10/30 03:23 PM', '+1.5 -108', '-1.5 -102'), ('10/30 03:22 PM', '+1.5 -106', '-1.5 -104'), ('10/30 02:14 PM', '+1.5 -104', '-1.5 -106'), ('10/30 01:41 PM', '+1.5 -105', '-1.5 -105'), ('10/30 01:37 PM', '+1.5 -107', '-1.5 -103'), ('10/30 01:36 PM', '+1.5 -105', '-1.5 -105'), ('10/30 01:06 PM', '+1.5 -103', '-1.5 -107'), ('10/30 12:56 PM', '+2 -111', '-2 +101'), ('10/30 12:53 PM', '+2 -110', '-2 +100'), ('10/30 12:50 PM', '+2 -113', '-2 +103'), ('10/30 12:49 PM', '+2 -112', '-2 +102'), ('10/30 12:46 PM', '+2 -113', '-2 +103'), ('10/30 12:45 PM', '+2 -110', '-2 +100'), ('10/30 12:43 PM', '+2 -108', '-2 -102'), ('10/30 12:38 PM', '+2.5 -116', '-2.5 +106'), ('10/30 12:38 PM', '+2.5 -113', '-2.5 +103'), ('10/30 12:37 PM', '+2.5 -110', '-2.5 +100'), ('10/30 10:30 AM', '+2.5 -105', '-2.5 -105'), ('10/30 10:07 AM', '+3 -113', '-3 +103'), ('10/30 09:55 AM', '+3 -112', '-3 +102'), ('10/30 09:51 AM', '+3 -110', '-3 +100'), ('10/30 09:32 AM', '+3 -109', '-3 -101'), ('10/30 06:04 AM', '+3 -110', '-3 +100'), ('10/30 03:16 AM', '+3 -107', '-3 -103'), ('10/30 03:14 AM', '+3.5 -116', '-3.5 +106'), ('10/30 01:03 AM', '+3.5 -115', '-3.5 +105'), ('10/30 12:17 AM', '+3.5 -110', '-3.5 +100'), ('10/29 08:52 PM', '+3.5 -108', '-3.5 -102'), ('10/29 01:31 PM', '+3.5 -105', '-3.5 -105'), ('10/29 06:48 AM', '+3.5 -110', '-3.5 +100'), ('10/29 06:47 AM', '+3.5 -109', '-3.5 -101'), ('10/29 05:39 AM', '+3.5 -113', '-3.5 +103'), ('10/29 03:34 AM', '+3.5 -108', '-3.5 -102'), ('10/29 12:44 AM', '+3.5 -110', '-3.5 +100'), ('10/29 12:41 AM', '+3.5 -107', '-3.5 -103'), ('10/29 12:40 AM', '+3.5 -105', '-3.5 -105'), ('10/28 12:52 PM', '+4 -105', '-4 -105')]

    >>> from datetime import timedelta, datetime
    >>> base_date = "10/29 06:58 AM"
    >>> b_d = datetime.strptime(base_date, "%m/%d %I:%M %p")
    >>> def func(x):
    ...     d = datetime.strptime(x[0], "%m/%d %I:%M %p")
    ...     delta = d - b_d if d > b_d else timedelta.max
    ...     return delta
    ...
    >>> min(list_date, key = func)
    ('10/30 02:17 PM', '+100', '-110')

datetime.strptime converts the date string into a datetime object, so b_d now looks like this:

    >>> b_d
    datetime.datetime(1900, 10, 29, 6, 58)
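The year 1900 in that output is simply the default that strptime fills in when the format string has no year field. That is harmless while all the dates fall in the same year, but "02/29" cannot be parsed this way because 1900 was not a leap year. A minimal sketch of one workaround, assuming the year is known from elsewhere (the parse_with_year helper and the year 2012 are illustrative and not part of the original answer):

```python
from datetime import datetime

FMT = "%m/%d %I:%M %p"

def parse_with_year(s, year):
    """Parse 'MM/DD HH:MM AM/PM' as a datetime in an explicitly supplied year."""
    # Prepending the year lets strptime validate the day against the right calendar.
    return datetime.strptime("%d/%s" % (year, s), "%Y/" + FMT)

print(parse_with_year("10/29 06:58 AM", 2012))   # 2012-10-29 06:58:00
print(parse_with_year("02/29 11:00 PM", 2012))   # 2012-02-29 23:00:00
```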

Now we can write a function that can be passed as the key parameter of min:

 delta = d - b_d if d > b_d else timedelta.max 

If d > b_d, i.e. if the date passed in by min is later than base_date, their difference is assigned to delta; otherwise delta is set to timedelta.max.

    >>> timedelta.max
    datetime.timedelta(999999999, 86399, 999999)
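One caveat with this key function: if no entry is later than base_date, every key equals timedelta.max and min simply returns the first tuple, which is an earlier date. A hedged sketch of a variant that filters first and returns None in that case (the name closest_after and the None fallback are assumptions for illustration, not part of the answer above; the default argument of min needs Python 3.4 or later):

```python
from datetime import datetime

FMT = "%m/%d %I:%M %p"

def closest_after(rows, base_date, fmt=FMT):
    """Return the row whose date is the earliest one strictly after base_date, or None."""
    base = datetime.strptime(base_date, fmt)
    # Parse each date once and keep only the rows later than the base date.
    parsed = ((datetime.strptime(row[0], fmt), row) for row in rows)
    later = [(dt, row) for dt, row in parsed if dt > base]
    # Compare on the parsed datetime only; default=None covers the empty case.
    best = min(later, key=lambda pair: pair[0], default=None)
    return best[1] if best is not None else None

list_date = [('10/30 02:18 PM', '-103', '-107'),
             ('10/30 02:17 PM', '+100', '-110'),
             ('10/29 02:15 AM', '-101', '-109')]

print(closest_after(list_date, "10/29 06:58 AM"))  # ('10/30 02:17 PM', '+100', '-110')
print(closest_after(list_date, "10/31 12:00 AM"))  # None
```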

Update:

    >>> from datetime import timedelta, datetime
    >>> base_date = '10/29 06:59 AM'
    >>> b_d = datetime.strptime(base_date, "%m/%d %I:%M %p")
    >>> def func(x):
    ...     d = datetime.strptime(x[0], "%m/%d %I:%M %p")
    ...     delta = d - b_d if d > b_d else timedelta.max
    ...     return delta
    ...
    >>> lis2 = [('10/30 04:30 PM', '+1.5 -111', '-1.5 +101'), ('10/30 04:24 PM', '+1.5 -110', '-1.5 +100'), ('10/30 04:21 PM', '+1.5 -111', '-1.5 +101'), ('10/30 04:15 PM', '+1.5 -112', '-1.5 +102'), ('10/30 04:14 PM', '+1.5 -110', '-1.5 +100'), ('10/30 03:57 PM', '+1.5 -111', '-1.5 +101'), ('10/30 03:40 PM', '+1.5 -110', '-1.5 +100'), ('10/30 03:31 PM', '+1.5 -111', '-1.5 +101'), ('10/30 03:30 PM', '+1.5 -109', '-1.5 -101'), ('10/30 03:25 PM', '+1.5 -107', '-1.5 -103'), ('10/30 03:24 PM', '+1.5 -110', '-1.5 +100'), ('10/30 03:23 PM', '+1.5 -108', '-1.5 -102'), ('10/30 03:22 PM', '+1.5 -106', '-1.5 -104'), ('10/30 02:14 PM', '+1.5 -104', '-1.5 -106'), ('10/30 01:41 PM', '+1.5 -105', '-1.5 -105'), ('10/30 01:37 PM', '+1.5 -107', '-1.5 -103'), ('10/30 01:36 PM', '+1.5 -105', '-1.5 -105'), ('10/30 01:06 PM', '+1.5 -103', '-1.5 -107'), ('10/30 12:56 PM', '+2 -111', '-2 +101'), ('10/30 12:53 PM', '+2 -110', '-2 +100'), ('10/30 12:50 PM', '+2 -113', '-2 +103'), ('10/30 12:49 PM', '+2 -112', '-2 +102'), ('10/30 12:46 PM', '+2 -113', '-2 +103'), ('10/30 12:45 PM', '+2 -110', '-2 +100'), ('10/30 12:43 PM', '+2 -108', '-2 -102'), ('10/30 12:38 PM', '+2.5 -116', '-2.5 +106'), ('10/30 12:38 PM', '+2.5 -113', '-2.5 +103'), ('10/30 12:37 PM', '+2.5 -110', '-2.5 +100'), ('10/30 10:30 AM', '+2.5 -105', '-2.5 -105'), ('10/30 10:07 AM', '+3 -113', '-3 +103'), ('10/30 09:55 AM', '+3 -112', '-3 +102'), ('10/30 09:51 AM', '+3 -110', '-3 +100'), ('10/30 09:32 AM', '+3 -109', '-3 -101'), ('10/30 06:04 AM', '+3 -110', '-3 +100'), ('10/30 03:16 AM', '+3 -107', '-3 -103'), ('10/30 03:14 AM', '+3.5 -116', '-3.5 +106'), ('10/30 01:03 AM', '+3.5 -115', '-3.5 +105'), ('10/30 12:17 AM', '+3.5 -110', '-3.5 +100'), ('10/29 08:52 PM', '+3.5 -108', '-3.5 -102'), ('10/29 01:31 PM', '+3.5 -105', '-3.5 -105'), ('10/29 06:48 AM', '+3.5 -110', '-3.5 +100'), ('10/29 06:47 AM', '+3.5 -109', '-3.5 -101'), ('10/29 05:39 AM', '+3.5 -113', '-3.5 +103'), ('10/29 03:34 AM', '+3.5 -108', '-3.5 -102'), ('10/29 12:44 AM', '+3.5 -110', '-3.5 +100'), ('10/29 12:41 AM', '+3.5 -107', '-3.5 -103'), ('10/29 12:40 AM', '+3.5 -105', '-3.5 -105'), ('10/28 12:52 PM', '+4 -105', '-4 -105')]
    >>> min(lis2, key = func)
    ('10/29 01:31 PM', '+3.5 -105', '-3.5 -105')

Timing comparisons:

Script:

    from datetime import datetime, timedelta

    list_date = [('10/30 04:30 PM', '+1.5 -111', '-1.5 +101'), ('10/30 04:24 PM', '+1.5 -110', '-1.5 +100'), ('10/30 04:21 PM', '+1.5 -111', '-1.5 +101'), ('10/30 04:15 PM', '+1.5 -112', '-1.5 +102'), ('10/30 04:14 PM', '+1.5 -110', '-1.5 +100'), ('10/30 03:57 PM', '+1.5 -111', '-1.5 +101'), ('10/30 03:40 PM', '+1.5 -110', '-1.5 +100'), ('10/30 03:31 PM', '+1.5 -111', '-1.5 +101'), ('10/30 03:30 PM', '+1.5 -109', '-1.5 -101'), ('10/30 03:25 PM', '+1.5 -107', '-1.5 -103'), ('10/30 03:24 PM', '+1.5 -110', '-1.5 +100'), ('10/30 03:23 PM', '+1.5 -108', '-1.5 -102'), ('10/30 03:22 PM', '+1.5 -106', '-1.5 -104'), ('10/30 02:14 PM', '+1.5 -104', '-1.5 -106'), ('10/30 01:41 PM', '+1.5 -105', '-1.5 -105'), ('10/30 01:37 PM', '+1.5 -107', '-1.5 -103'), ('10/30 01:36 PM', '+1.5 -105', '-1.5 -105'), ('10/30 01:06 PM', '+1.5 -103', '-1.5 -107'), ('10/30 12:56 PM', '+2 -111', '-2 +101'), ('10/30 12:53 PM', '+2 -110', '-2 +100'), ('10/30 12:50 PM', '+2 -113', '-2 +103'), ('10/30 12:49 PM', '+2 -112', '-2 +102'), ('10/30 12:46 PM', '+2 -113', '-2 +103'), ('10/30 12:45 PM', '+2 -110', '-2 +100'), ('10/30 12:43 PM', '+2 -108', '-2 -102'), ('10/30 12:38 PM', '+2.5 -116', '-2.5 +106'), ('10/30 12:38 PM', '+2.5 -113', '-2.5 +103'), ('10/30 12:37 PM', '+2.5 -110', '-2.5 +100'), ('10/30 10:30 AM', '+2.5 -105', '-2.5 -105'), ('10/30 10:07 AM', '+3 -113', '-3 +103'), ('10/30 09:55 AM', '+3 -112', '-3 +102'), ('10/30 09:51 AM', '+3 -110', '-3 +100'), ('10/30 09:32 AM', '+3 -109', '-3 -101'), ('10/30 06:04 AM', '+3 -110', '-3 +100'), ('10/30 03:16 AM', '+3 -107', '-3 -103'), ('10/30 03:14 AM', '+3.5 -116', '-3.5 +106'), ('10/30 01:03 AM', '+3.5 -115', '-3.5 +105'), ('10/30 12:17 AM', '+3.5 -110', '-3.5 +100'), ('10/29 08:52 PM', '+3.5 -108', '-3.5 -102'), ('10/29 01:31 PM', '+3.5 -105', '-3.5 -105'), ('10/29 06:48 AM', '+3.5 -110', '-3.5 +100'), ('10/29 06:47 AM', '+3.5 -109', '-3.5 -101'), ('10/29 05:39 AM', '+3.5 -113', '-3.5 +103'), ('10/29 03:34 AM', '+3.5 -108', '-3.5 -102'), ('10/29 12:44 AM', '+3.5 -110', '-3.5 +100'), ('10/29 12:41 AM', '+3.5 -107', '-3.5 -103'), ('10/29 12:40 AM', '+3.5 -105', '-3.5 -105'), ('10/28 12:52 PM', '+4 -105', '-4 -105')]

    base_date = "10/29 06:58 AM"

    def func1(list_date):
        #http://stackoverflow.com/a/17249420/846892
        get_datetime = lambda s: datetime.strptime(s, "%m/%d %I:%M %p")
        base = get_datetime(base_date)
        later = filter(lambda d: get_datetime(d[0]) > base, list_date)
        return min(later, key = lambda d: get_datetime(d[0]))

    def func2(list_date):
        #http://stackoverflow.com/a/17249470/846892
        b_d = datetime.strptime(base_date, "%m/%d %I:%M %p")
        def func(x):
            d = datetime.strptime(x[0], "%m/%d %I:%M %p")
            delta = d - b_d if d > b_d else timedelta.max
            return delta
        return min(list_date, key = func)

    def func3(list_date):
        #http://stackoverflow.com/a/17249529/846892
        # The key is only True/False, so this relies on list_date being sorted newest first.
        fmt = '%m/%d %I:%M %p'
        d = datetime.strptime(base_date, fmt)
        def foo(x):
            return (datetime.strptime(x[0],fmt)-d).total_seconds() > 0
        return sorted(list_date, key=foo)[-1]

    def func4(list_date):
        #http://stackoverflow.com/a/17249441/846892
        fmt = '%m/%d %I:%M %p'
        base_d = datetime.strptime(base_date, fmt)
        candidates = ((datetime.strptime(d, fmt), d, x, y) for d, x, y in list_date)
        candidates = min((dt, d, x, y) for dt, d, x, y in candidates if dt > base_d)
        return candidates[1:]

Results:

    >>> from so import *

    #check output first
    >>> func1(list_date)
    ('10/29 01:31 PM', '+3.5 -105', '-3.5 -105')
    >>> func2(list_date)
    ('10/29 01:31 PM', '+3.5 -105', '-3.5 -105')
    >>> func3(list_date)
    ('10/29 01:31 PM', '+3.5 -105', '-3.5 -105')
    >>> func4(list_date)
    ('10/29 01:31 PM', '+3.5 -105', '-3.5 -105')

    >>> %timeit func1(list_date)
    100 loops, best of 3: 3.07 ms per loop
    >>> %timeit func2(list_date)  #winner
    100 loops, best of 3: 1.59 ms per loop
    >>> %timeit func3(list_date)
    100 loops, best of 3: 1.91 ms per loop
    >>> %timeit func4(list_date)
    1000 loops, best of 3: 2.02 ms per loop

    #increase the input size
    >>> list_date = list_date *10**3
    >>> len(list_date)
    48000
    >>> %timeit func1(list_date)
    1 loops, best of 3: 3.6 s per loop
    >>> %timeit func2(list_date)  #winner
    1 loops, best of 3: 1.99 s per loop
    >>> %timeit func3(list_date)
    1 loops, best of 3: 2.09 s per loop
    >>> %timeit func4(list_date)
    1 loops, best of 3: 2.02 s per loop

    #increase the input size again
    >>> list_date = list_date *10
    >>> len(list_date)
    480000
    >>> %timeit func1(list_date)
    1 loops, best of 3: 36.4 s per loop
    >>> %timeit func2(list_date)  #winner
    1 loops, best of 3: 20.2 s per loop
    >>> %timeit func3(list_date)
    1 loops, best of 3: 22.8 s per loop
    >>> %timeit func4(list_date)
    1 loops, best of 3: 22.7 s per loop
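The %timeit lines above are IPython magics. Outside IPython, the standard timeit module gives a comparable measurement. A rough sketch, assuming a shortened three-row list repeated 100 times and a copy of func2 from the script (the list size and repeat counts are arbitrary choices, and absolute numbers will differ from those above):

```python
import timeit
from datetime import datetime, timedelta

base_date = "10/29 06:58 AM"
list_date = [('10/30 02:18 PM', '-103', '-107'),
             ('10/30 02:17 PM', '+100', '-110'),
             ('10/29 02:15 AM', '-101', '-109')] * 100

def func2(rows):
    """Copy of func2 from the script: min() with a timedelta-based key."""
    b_d = datetime.strptime(base_date, "%m/%d %I:%M %p")
    def func(x):
        d = datetime.strptime(x[0], "%m/%d %I:%M %p")
        return d - b_d if d > b_d else timedelta.max
    return min(rows, key=func)

# Call the function 10 times per measurement, take 3 measurements, keep the fastest.
best = min(timeit.repeat(lambda: func2(list_date), number=10, repeat=3))
print("best of 3: %.3f ms per call" % (best / 10 * 1000))
```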

You can do this with the datetime module, which can parse a date string into a datetime object that supports comparison and date arithmetic:

    from datetime import datetime

    # function for parsing strings using specific format
    get_datetime = lambda s: datetime.strptime(s, "%m/%d %I:%M %p")

    base = get_datetime(base_date)
    later = filter(lambda d: get_datetime(d[0]) > base, list_date)
    closest_date = min(later, key = lambda d: get_datetime(d[0]))

Linear search?

    import time

    base_date = "10/29 06:58 AM"

    def str_to_my_time(my_str):
        # No year in the format, so strptime assumes year 1900;
        # mktime may reject such early dates on some platforms.
        return time.mktime(time.strptime(my_str, "%m/%d %I:%M %p"))

    base_dt = str_to_my_time(base_date)

    list_date = [('10/30 02:18 PM', '-103', '-107'),
                 ('10/30 02:17 PM', '+100', '-110'),
                 ('10/29 02:15 AM', '-101', '-109')]

    best_delta = float("inf")  # stands in for Python 2's sys.maxint
    best_match = None

    for t in list_date:
        the_dt = str_to_my_time(t[0])
        delta_sec = the_dt - base_dt
        if 0 <= delta_sec < best_delta:
            best_delta = delta_sec
            best_match = t

    print(best_match, best_delta)

Output:

    ('10/30 02:17 PM', '+100', '-110') 112740.0

    import time

    # The function
    def to_sec(date_string):
        return time.mktime(time.strptime(date_string, '%m/%d %I:%M %p'))

    # The test
    base_date = "10/29 06:58 AM"
    base_date_sec = to_sec(base_date)

    result = None
    difference = float("inf")  # stands in for Python 2's sys.maxint

    list_date = [
        ('10/30 02:18 PM', '-103', '-107'),
        ('10/30 02:17 PM', '+100', '-110'),
        ('10/29 02:15 AM', '-101', '-109')
    ]

    for date_str in list_date:
        diff_sec = to_sec(date_str[0]) - base_date_sec
        if 0 <= diff_sec < difference:
            result = date_str
            difference = diff_sec

    print(result)

    import datetime

    fmt = '%m/%d %I:%M %p'  # %I, not %H: %p only affects the hour when %I is used
    d = datetime.datetime.strptime(base_date, fmt)

    def foo(x):
        return (datetime.datetime.strptime(x[0], fmt) - d).total_seconds() > 0

    # The key is only True/False, so this returns the last later-than-base tuple in
    # list order; it is the closest one only if list_date is sorted newest first.
    sorted(list_date, key=foo)[-1]

Decorate, filter, find the closest date, undecorate:

    >>> base_date = "10/29 06:58 AM"
    >>> list_date = [
    ...     ('10/30 02:18 PM', '-103', '-107'),
    ...     ('10/30 02:17 PM', '+100', '-110'),
    ...     ('10/29 02:15 AM', '-101', '-109')
    ... ]
    >>> import datetime
    >>> fmt = '%m/%d %I:%M %p'  # %I, not %H: %p only affects the hour when %I is used
    >>> base_d = datetime.datetime.strptime(base_date, fmt)
    >>> candidates = ((datetime.datetime.strptime(d, fmt), d, x, y) for d, x, y in list_date)
    >>> candidates = min((dt, d, x, y) for dt, d, x, y in candidates if dt > base_d)
    >>> print(candidates[1:])
    ('10/30 02:17 PM', '+100', '-110')

You might consider putting the list of dates into a Pandas index and then using the "truncate" or "get_loc" function.

    import pandas as pd

    ## Initial inputs (reordered to show the method is insensitive to input order)
    list_date = [('10/30 02:18 PM', '-103', '-107'),
                 ('10/29 02:15 AM', '-101', '-109'),
                 ('10/30 02:17 PM', '+100', '-110')]
    base_date = "10/29 06:58 AM"

    ## Make a data frame indexed by the parsed dates
    df = pd.DataFrame(list_date)
    df.columns = ['date', 'val1', 'val2']
    dateIndex = pd.to_datetime(df['date'], format='%m/%d %I:%M %p')
    df = df.set_index(dateIndex)
    df = df.sort_index()  # earliest comes on top

    ## Find the result
    base_dateObj = pd.to_datetime(base_date, format='%m/%d %I:%M %p')
    # truncate(before=...) drops every row earlier than the base date,
    # so the first remaining row is the closest date that is not earlier.
    result = df.truncate(before=base_dateObj).iloc[0]
    print((result['date'], result['val1'], result['val2']))
    # ('10/30 02:17 PM', '+100', '-110')
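The same sort-once-then-look-up idea also works without pandas. A minimal sketch using the standard bisect module (this is not part of the answer above; first_after is an illustrative name, and the strictly-later comparison is an assumption chosen to match the d > b_d test used in the other answers). It pays off mainly when many base dates are looked up against the same list:

```python
from bisect import bisect_right
from datetime import datetime

FMT = "%m/%d %I:%M %p"

list_date = [('10/30 02:18 PM', '-103', '-107'),
             ('10/29 02:15 AM', '-101', '-109'),
             ('10/30 02:17 PM', '+100', '-110')]

# Sort the rows by parsed date once and keep the parsed keys in a parallel list.
rows = sorted(list_date, key=lambda row: datetime.strptime(row[0], FMT))
keys = [datetime.strptime(row[0], FMT) for row in rows]

def first_after(base_date):
    """Return the first row dated strictly after base_date, or None if there is none."""
    i = bisect_right(keys, datetime.strptime(base_date, FMT))
    return rows[i] if i < len(rows) else None

print(first_after("10/29 06:58 AM"))  # ('10/30 02:17 PM', '+100', '-110')
print(first_after("10/30 02:17 PM"))  # ('10/30 02:18 PM', '-103', '-107')
```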

Reference: this link