How to reduce data to the longest string per group with pandas?


By : user2956017
Date : November 22 2020, 10:48 AM
The code should work. Incidentally, you don't need to wrap f1 inside another lambda; just pass f1 directly, since the two have exactly the same parameter signature.
code :
>>> import pandas as pd
>>>
>>> def f1(s):
...     return max(s, key=len)
...
>>> data = pd.DataFrame([
...     {'id': 'GB', 'name': '"United Kingdom"'},
...     {'id': 'GB', 'name': 'England'},
...     {'id': 'US', 'name': '"United States"'},
...     {'id': 'US', 'name': 'America'},
...
... ])
>>> data.groupby('id').agg({'name': f1})
                name
id
GB  "United Kingdom"
US   "United States"
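
As a variation on the answer above, here is a minimal sketch that avoids a Python-level aggregation function by combining `Series.str.len` with `GroupBy.idxmax`. It reuses the sample data from the answer; the variable names `longest_idx` and `result` are only illustrative. Selecting by row label also keeps any other columns of the winning row.

```python
import pandas as pd

data = pd.DataFrame({
    "id": ["GB", "GB", "US", "US"],
    "name": ['"United Kingdom"', "England", '"United States"', "America"],
})

# For each id, find the row label whose name is the longest.
longest_idx = data["name"].str.len().groupby(data["id"]).idxmax()

# Pull those rows out of the original frame and index them by id.
result = data.loc[longest_idx].set_index("id")
print(result)
```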


Find length of longest string in Pandas dataframe column


By : G-Wiz
Date : March 29 2020, 07:55 AM
DSM's suggestion seems to be about the best you're going to get without doing some manual micro-optimization:
code :
%timeit -n 100 df.col1.str.len().max()
100 loops, best of 3: 11.7 ms per loop

%timeit -n 100 df.col1.map(lambda x: len(x)).max()
100 loops, best of 3: 16.4 ms per loop

%timeit -n 100 df.col1.map(len).max()
100 loops, best of 3: 10.1 ms per loop
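
Speed is not the only difference between these variants. The sketch below, which uses a made-up four-row column, suggests that `.str.len()` tolerates missing values while `map(len)` does not, which may matter more than a few milliseconds on real data.

```python
import pandas as pd

# Hypothetical sample column with one missing value.
df = pd.DataFrame({"col1": pd.Series(["a", "abc", "ab", None], dtype=object)})

# .str.len() propagates the missing value as NaN, and max() skips it.
longest = df.col1.str.len().max()
print(longest)

# map(len) calls len() on every element, including the missing one.
try:
    df.col1.map(len).max()
    map_failed = False
except TypeError:
    map_failed = True
print(map_failed)
```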
How do I count the longest run of consecutive '0's flanked by '1's in a string using a pandas dataframe


By : user2964407
Date : March 29 2020, 07:55 AM
You can use string operations with split: convert the column to a string, split it on '1', and take the length of the longest piece.
code :
# max(..., key=len) picks the longest piece by length rather than by string comparison
df['result'] = df.label.astype(str).str.split('1').apply(lambda x: len(max(x, key=len)))
   Id  label  result
0   1      1       0
1   2     11       0
2   3    101       1
3   4  10101       1
4   5   1001       2
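
Note that splitting on '1' also counts zeros at the very start or end of the string, so a label such as 100 would give 2 even though those zeros are not flanked on both sides. If strict flanking matters, a regular expression with lookarounds is one option. This is a sketch only; the helper name `longest_flanked_zero_run` and the extra sixth row are assumptions added for illustration.

```python
import re

import pandas as pd


def longest_flanked_zero_run(label) -> int:
    """Length of the longest run of '0' with a '1' directly on both sides."""
    # Lookbehind/lookahead check for the flanking '1' without consuming it,
    # so two neighbouring runs can share the same '1' (as in 10101).
    runs = re.findall(r"(?<=1)0+(?=1)", str(label))
    return max(map(len, runs), default=0)


df = pd.DataFrame({
    "Id": [1, 2, 3, 4, 5, 6],
    "label": [1, 11, 101, 10101, 1001, 100],  # last row: zeros not flanked
})
df["result"] = df["label"].map(longest_flanked_zero_run)
print(df)
```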
Pandas - Find longest streak of string values in column together with row id


By : user3384393
Date : March 29 2020, 07:55 AM
We can use Series.cumsum with Series.shift to label groups of consecutive identical names (the resulting labels are printed at the end of the code). GroupBy.agg then builds a dataframe holding the size of each group along with its first index and datetime value. Sort this dataframe by size with DataFrame.sort_values and call DataFrame.drop_duplicates to discard groups that share a name but have a smaller size. Convert the columns to str (you may also need to convert Datetime if your actual data is not already str), then join them with Series.str.cat. Finally, DataFrame.set_index followed by Series.to_dict produces the dictionary.
code :
groups=df['Name'].ne(df['Name'].shift()).cumsum()
df_agg= (   df.groupby(groups,sort=False).agg(Name=('Name','first'),
                                              Datemin=('Datetime','first'),
                                              length=('Name','size'),
                                              idxmin=('ID','idxmin'))
              .sort_values('length',ascending=False)
              .drop_duplicates('Name')
        )


df_agg['j1']=df_agg['length'].astype(str).str.cat(df_agg['idxmin'].astype(str),sep=',')
df_agg['j']=df_agg['j1'].str.cat(df_agg['Datemin'],sep=' or ')
print(df_agg)

        Name  length  idxmin Datemin   j1             j
Name                                                  
4     Esther       3       4   Date5  3,4  3,4 or Date5
1     Harald       2       0   Date1  2,0  2,0 or Date1
3      Steve       1       3   Date4  1,3  1,3 or Date4
my_dict=df_agg.set_index('Name')['j'].to_dict()
print(my_dict)
{'Esther': '3,4 or Date5', 'Harald': '2,0 or Date1', 'Steve': '1,3 or Date4'}
print(groups)

0    1
1    1
2    2
3    3
4    4
5    4
6    4
Name: Name, dtype: int64
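
The answer does not show its input frame, so the following is a self-contained sketch of the same run-labelling idea on made-up data that reproduces the group labels printed above. The `Date1`..`Date7` values, the `row` helper column, and the use of `GroupBy.idxmax` in place of sort-and-drop-duplicates are assumptions for illustration.

```python
import pandas as pd

# Hypothetical input: seven rows with string "dates".
df = pd.DataFrame({
    "Name": ["Harald", "Harald", "Esther", "Steve", "Esther", "Esther", "Esther"],
    "Datetime": [f"Date{i}" for i in range(1, 8)],
})

# A new run starts wherever the name differs from the previous row;
# the cumulative sum of those change flags gives each run its own id.
run_id = df["Name"].ne(df["Name"].shift()).cumsum().rename("run")

# One row per run: name, first row position, first date, and run length.
runs = (
    df.assign(row=df.index)
      .groupby(run_id, sort=False)
      .agg(Name=("Name", "first"),
           first_row=("row", "first"),
           first_date=("Datetime", "first"),
           length=("Name", "size"))
)

# Keep only the longest run for each name.
longest = runs.loc[runs.groupby("Name")["length"].idxmax()]
summary = longest.set_index("Name")[["length", "first_row", "first_date"]]
print(summary)
```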
pandas dataframe longest series with uninterrupted data


By : Jos Arturo Llanquihu
Date : March 29 2020, 07:55 AM
You can first make a temporary DataFrame in which each series of uninterrupted data is labeled with a number that is unique within its column. Then put the original NaNs back, so that the longest series cannot be a series of NaNs.
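
The description above can be sketched roughly as follows. The sample frame and the names `labels` and `longest` are invented for illustration, and this is one possible reading of the approach rather than the answer's own code.

```python
import numpy as np
import pandas as pd

# Hypothetical data with gaps.
df = pd.DataFrame({
    "a": [1.0, np.nan, 2.0, 3.0, 4.0, np.nan],
    "b": [np.nan, 1.0, 2.0, np.nan, np.nan, 5.0],
})

# Temporary frame: the running count of NaNs changes at every gap, so each
# uninterrupted stretch of data gets a number that is unique within its column.
# .where() then puts NaN back wherever the original value was missing.
labels = df.isna().cumsum().where(df.notna())

# value_counts() ignores NaN, so the largest count per column is the length
# of the longest uninterrupted stretch of real data.
longest = labels.apply(lambda col: col.value_counts().max())
print(longest)
```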
How to find the longest consecutive string of values in pandas dataframe


By : Sagar bobade
Date : March 29 2020, 07:55 AM
I'm looking to find the longest string of zeros in my pandas df. I have a df with 10 columns, each with 25000 rows that hold either a null, a zero, or a non-zero number. I am looking to calculate:
Setup
Consider the dataframe df
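
One way such a calculation could be approached is sketched below, assuming the goal is the length of the longest run of consecutive zeros in each column, with nulls and non-zero numbers both breaking a run. The small frame and the helper name `longest_zero_run` are made up for illustration.

```python
import numpy as np
import pandas as pd


def longest_zero_run(col: pd.Series) -> int:
    """Length of the longest run of consecutive zeros in one column."""
    is_zero = col.eq(0)  # NaN compares as False, so nulls break a run
    # Every non-zero (or null) entry bumps the counter, so the zeros that
    # follow it share one run id until the next non-zero entry.
    run_id = (~is_zero).cumsum()
    # Summing the boolean mask per run counts only the zeros in that run.
    return int(is_zero.groupby(run_id).sum().max())


# Hypothetical stand-in for the real 10-column, 25000-row frame.
df = pd.DataFrame({
    "x": [0, 0, np.nan, 0, 0, 0, 5],
    "y": [1, 0, np.nan, np.nan, 0, 0, 2],
})
zero_runs = df.apply(longest_zero_run)
print(zero_runs)
```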